Strip superfluous metadata from notebooks

To avoid pointless merge conflicts when working with Jupyter notebooks (caused by differing execution counts or cell metadata), it is recommended to clean the notebooks before committing anything. This is done automatically if you install the git hooks with nbdev_install_git_hooks. The following functions do the cleaning.

Utils

rm_execution_count

rm_execution_count(o)

Remove execution count in o
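As an illustration of what this amounts to, here is a minimal sketch, assuming o is a plain dict taken from the notebook JSON (a cell or an output). The name rm_execution_count_sketch is hypothetical and the body is not necessarily the library's exact source.

```python
def rm_execution_count_sketch(o):
    "Reset the execution count of `o` (a cell or output dict) in place, if it has one."
    if 'execution_count' in o:
        o['execution_count'] = None
    return o

out = {'execution_count': 2, 'output': 'super'}
rm_execution_count_sketch(out)
print(out)
```

The key is kept and set to None rather than deleted, which matches the expected values in the tests further down this page.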

clean_output_data_vnd

clean_output_data_vnd(o)

Remove application/vnd.google.colaboratory.intrinsic+json in data entries
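A minimal sketch of this step, assuming o is an output dict whose optional data entry maps MIME types to content. The names COLAB_JSON and clean_output_data_vnd_sketch are hypothetical, introduced only for this example.

```python
# MIME key that Colab adds to output data, as named in the docstring above.
COLAB_JSON = 'application/vnd.google.colaboratory.intrinsic+json'

def clean_output_data_vnd_sketch(o):
    "Drop the Colab-specific entry from the `data` dict of output `o`, in place."
    if 'data' in o:
        # pop with a default so outputs without the key are left untouched
        o['data'].pop(COLAB_JSON, None)
    return o

out = {'data': {COLAB_JSON: {'type': 'string'},
                'plain/text': ['sample output']}}
clean_output_data_vnd_sketch(out)
print(out)
```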

clean_cell_output

clean_cell_output(cell)

Remove execution counts and other superfluous data (the Colab vnd entry, output tags) from the outputs of cell

clean_cell

clean_cell(cell, clear_all=False)

Clean cell by removing superfluous metadata or everything except the input if clear_all

tst = {'cell_type': 'code',
       'execution_count': 26,
       'metadata': {'hide_input': True, 'meta': 23},
       'outputs': [{'execution_count': 2, 
                    'data': {
                        'application/vnd.google.colaboratory.intrinsic+json': {
                            'type': 'string'},
                        'plain/text': ['sample output',]
                    },
                    'output': 'super'}],
       
       'source': 'awesome_code'}
tst1 = tst.copy()

clean_cell(tst)
test_eq(tst, {'cell_type': 'code',
              'execution_count': None,
              'metadata': {'hide_input': True},
              'outputs': [{'execution_count': None, 
                           'data': {'plain/text': ['sample output',]},
                           'output': 'super'}],
              'source': 'awesome_code'})

clean_cell(tst1, clear_all=True)
test_eq(tst1, {'cell_type': 'code',
               'execution_count': None,
               'metadata': {},
               'outputs': [],
               'source': 'awesome_code'})

tst2 = {'metadata': {'tags': []},
        'outputs': [{'metadata': {'tags': []}}],
        'source': ['']}
clean_cell(tst2, clear_all=False)
test_eq(tst2, {'metadata': {},
               'outputs': [{'metadata': {}}],
               'source': []})
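
The behaviour exercised by these tests can be sketched as follows. This is an illustration reconstructed from the tests, not necessarily the library's exact source: the names clean_cell_sketch, COLAB_JSON and CELL_METADATA_KEEP are hypothetical, and the assumption that hide_input is the only cell metadata key kept is inferred from the expected values above.

```python
COLAB_JSON = 'application/vnd.google.colaboratory.intrinsic+json'
CELL_METADATA_KEEP = ['hide_input']  # assumption: inferred from the tests

def clean_cell_sketch(cell, clear_all=False):
    "Clean `cell` in place; with `clear_all`, drop outputs and all metadata."
    if 'execution_count' in cell:
        cell['execution_count'] = None
    if 'outputs' in cell:
        if clear_all:
            cell['outputs'] = []
        else:
            for o in cell['outputs']:
                if 'execution_count' in o:
                    o['execution_count'] = None
                o.get('data', {}).pop(COLAB_JSON, None)   # Colab-only entry
                o.get('metadata', {}).pop('tags', None)   # output tags
    # An empty source is normalised from [''] to []
    if cell.get('source') == ['']:
        cell['source'] = []
    meta = cell.get('metadata', {})
    cell['metadata'] = {} if clear_all else {
        k: v for k, v in meta.items() if k in CELL_METADATA_KEEP}
    return cell

cell = {'metadata': {'tags': [], 'hide_input': True},
        'outputs': [{'metadata': {'tags': []}}],
        'source': ['']}
clean_cell_sketch(cell)
print(cell)
```

Note that the cell is modified in place, so keep a deep copy (for instance with copy.deepcopy) if the original is still needed.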

clean_nb

clean_nb(nb, clear_all=False)

Clean nb from superfluous metadata, passing clear_all to clean_cell

tst = {'cell_type': 'code',
       'execution_count': 26,
       'metadata': {'hide_input': True, 'meta': 23},
       'outputs': [{'execution_count': 2,
                    'data': {
                        'application/vnd.google.colaboratory.intrinsic+json': {
                            'type': 'string'},
                        'plain/text': ['sample output',]
                    },
                    'output': 'super'}],
       'source': 'awesome_code'}
nb = {'metadata': {'kernelspec': 'some_spec', 'jekyll': 'some_meta', 'meta': 37},
      'cells': [tst]}

clean_nb(nb)
test_eq(nb['cells'][0], {'cell_type': 'code',
              'execution_count': None,
              'metadata': {'hide_input': True},
              'outputs': [{'execution_count': None, 
                           'data': { 'plain/text': ['sample output',]},
                           'output': 'super'}],
              'source': 'awesome_code'})
test_eq(nb['metadata'], {'kernelspec': 'some_spec', 'jekyll': 'some_meta'})
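
The notebook-level step shown by this test can be sketched as below. It is a hedged illustration: clean_nb_sketch, NB_METADATA_KEEP and reset_count are hypothetical names, the list of kept keys is inferred from the test above, and the per-cell cleaner is passed in as an argument only to keep the example self-contained.

```python
NB_METADATA_KEEP = ['kernelspec', 'jekyll']  # assumption: inferred from the test

def clean_nb_sketch(nb, clean_cell, clear_all=False):
    "Apply `clean_cell` to every cell, then keep only whitelisted notebook metadata."
    for cell in nb['cells']:
        clean_cell(cell, clear_all=clear_all)
    nb['metadata'] = {k: v for k, v in nb['metadata'].items()
                      if k in NB_METADATA_KEEP}
    return nb

# Stand-in per-cell cleaner for the demo: it only resets the execution count.
def reset_count(cell, clear_all=False):
    cell['execution_count'] = None

nb = {'metadata': {'kernelspec': 'some_spec', 'jekyll': 'some_meta', 'meta': 37},
      'cells': [{'cell_type': 'code', 'execution_count': 26,
                 'metadata': {}, 'source': 'awesome_code'}]}
clean_nb_sketch(nb, reset_count)
print(nb['metadata'])
```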

Main function

nbdev_clean_nbs

nbdev_clean_nbs(fname:str=None, clear_all:bool_arg=False, disp:bool_arg=False, read_input_stream:bool_arg=False)

Clean all notebooks in fname to avoid merge conflicts

Parameter          Type      Default  Details
fname              str       None     A notebook name or glob to convert
clear_all          bool_arg  False    Clean all metadata and outputs
disp               bool_arg  False    Print the cleaned outputs
read_input_stream  bool_arg  False    Read the input stream instead of the nb folder

By default (fname left as None), all the notebooks in lib_folder are cleaned. You can opt in to fully cleaning the notebooks, removing every bit of metadata along with the cell outputs, by passing clear_all=True. disp is for internal use with git hooks only: it prints the cleaned notebook instead of saving it. Likewise, read_input_stream reads the notebook from the input stream instead of from file names.
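
To make the stream-based mode concrete, here is a rough sketch of reading a notebook from one stream and printing the cleaned JSON to another instead of saving a file. Everything in it is an assumption made for illustration: clean_stream_sketch and drop_counts are hypothetical helpers, the JSON formatting options are arbitrary, and the in-memory streams stand in for the standard input and output a git hook would use.

```python
import io
import json

def clean_stream_sketch(inp, out, clean_nb):
    "Read notebook JSON from `inp`, clean it, and write the result to `out`."
    nb = json.load(inp)
    clean_nb(nb)
    json.dump(nb, out, indent=1, sort_keys=True)
    out.write('\n')

# Stand-in cleaner for the demo: it only resets execution counts.
def drop_counts(nb):
    for cell in nb['cells']:
        if 'execution_count' in cell:
            cell['execution_count'] = None

raw = json.dumps({'cells': [{'cell_type': 'code', 'execution_count': 3,
                             'metadata': {}, 'outputs': [], 'source': []}],
                  'metadata': {}, 'nbformat': 4, 'nbformat_minor': 4})
src, dst = io.StringIO(raw), io.StringIO()
clean_stream_sketch(src, dst, drop_counts)
print(dst.getvalue())
```

In a hook, src and dst would typically be sys.stdin and sys.stdout, so the cleaned notebook never touches the file on disk.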