The nmrstarlib Tutorial

The nmrstarlib package provides classes and other facilities for parsing, accessing, and manipulating data stored in NMR-STAR and JSONized NMR-STAR formats. It also provides a simple command-line interface.

Using nmrstarlib as a library

Importing nmrstarlib package

If the nmrstarlib package is installed on the system, it can be imported:

In [1]:
import nmrstarlib

Constructing StarFile generator

The nmrstarlib module provides the read_files() generator function that yields StarFile instances. To construct a StarFile generator, pass one or more sources to read_files(): the path to a local NMR-STAR file, a directory of NMR-STAR files, an archive of NMR-STAR files, or a BMRB id:

In [2]:
import nmrstarlib

single_starfile = nmrstarlib.read_files("bmr18569.str")  # single NMR-STAR file
starfiles = nmrstarlib.read_files("bmr18569.str", "bmr336.str") # several NMR-STAR files
dir_starfiles = nmrstarlib.read_files("starfiles_dir")   # directory of NMR-STAR files
arch_starfiles = nmrstarlib.read_files("starfiles.zip")  # archive of NMR-STAR files
url_starfile = nmrstarlib.read_files("18569")            # BMRB id of NMR-STAR file

Processing StarFile generator

The StarFile generator can be processed in several ways:

  • Feed it to a for-loop and process one file at a time:
In [3]:
for starfile in nmrstarlib.read_files("18569", "15000"):
    print("BMRB id:", starfile.bmrbid)      # print BMRB id of StarFile
    print("File source:", starfile.source)  # print source of StarFile
    for saveframe_name in starfile.keys():  # print saveframe names
        print("\t", saveframe_name)
BMRB id: 18569
File source: http://rest.bmrb.wisc.edu/bmrb/NMR-STAR3/18569
         data
         comment_0
         save_entry_information
         comment_1
         save_entry_citation
         comment_2
         save_assembly
         comment_3
         save_EVH1
         comment_4
         save_natural_source
         comment_5
         save_experimental_source
         comment_6
         comment_7
         save_sample_1
         save_sample_2
         save_sample_3
         save_sample_4
         comment_8
         save_sample_conditions_1
         save_sample_conditions_2
         save_sample_conditions_3
         save_sample_conditions_4
         comment_9
         save_AZARA
         save_xwinnmr
         save_ANSIG
         save_CNS
         comment_10
         comment_11
         save_spectrometer_1
         save_spectrometer_2
         save_NMR_spectrometer_list
         comment_12
         save_experiment_list
         comment_13
         comment_14
         comment_15
         save_chemical_shift_reference_1
         comment_16
         comment_17
         save_assigned_chem_shift_list_1
         comment_18
         save_combined_NOESY_peak_list
BMRB id: 15000
File source: http://rest.bmrb.wisc.edu/bmrb/NMR-STAR3/15000
         data
         comment_0
         save_entry_information
         comment_1
         save_citation_1
         comment_2
         save_assembly
         comment_3
         save_F5-Phe-cVHP
         comment_4
         save_natural_source
         comment_5
         save_experimental_source
         comment_6
         save_chem_comp_PHF
         comment_7
         comment_8
         save_unlabeled_sample
         save_selectively_labeled_sample
         comment_9
         save_sample_conditions
         comment_10
         save_NMRPipe
         save_PIPP
         save_SPARKY
         save_CYANA
         save_X-PLOR_NIH
         comment_11
         comment_12
         save_spectrometer_1
         save_spectrometer_2
         save_spectrometer_3
         save_spectrometer_4
         save_spectrometer_5
         save_spectrometer_6
         save_NMR_spectrometer_list
         comment_13
         save_experiment_list
         comment_14
         comment_15
         comment_16
         save_chemical_shift_reference_1
         comment_17
         comment_18
         save_assigned_chem_shift_list_1

Note

Once the generator is consumed, it becomes empty and needs to be created again.
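
The note above describes ordinary Python generator behavior. The following minimal sketch illustrates it with a hypothetical stand-in, fake_read_files(), rather than nmrstarlib.read_files(), so that it runs without network access or local files. The stand-in and its dictionary items are assumptions made for illustration only.

```python
def fake_read_files(*sources):
    """Hypothetical stand-in for nmrstarlib.read_files(): yields one item per source."""
    for source in sources:
        yield {"data": source}


generator = fake_read_files("18569", "15000")

first_pass = [item["data"] for item in generator]   # consumes the generator
second_pass = [item["data"] for item in generator]  # nothing left to yield

# Create the generator again to iterate over the same sources once more.
generator = fake_read_files("18569", "15000")
third_pass = [item["data"] for item in generator]

print(first_pass, second_pass, third_pass)
```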

  • Since the StarFile generator behaves like an iterator, we can call the next() built-in function:
In [4]:
sf_generator = nmrstarlib.read_files("18569", "15000")

starfile1 = next(sf_generator)
starfile2 = next(sf_generator)

Note

Once the generator is exhausted, further calls to next() raise StopIteration; create the generator again to read the files once more.

  • Convert the StarFile generator into a list of StarFile objects:
In [5]:
starfiles_list = list(nmrstarlib.read_files("18569", "15000"))

Warning

Do not convert the StarFile generator into a list if the generator can yield a large number of files, e.g. several thousand, otherwise it can consume all available memory.
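
As a sketch of the alternative to building a list, a generator can be processed one item at a time, or truncated with itertools.islice from the standard library. The fake_read_files() function below is a hypothetical stand-in for a generator that yields many StarFile-like objects; it is not part of nmrstarlib.

```python
from itertools import islice


def fake_read_files(count):
    """Hypothetical stand-in for a generator yielding many StarFile-like objects."""
    for i in range(count):
        yield {"data": str(i)}


# Process items one at a time; only the current item is held in memory.
total = 0
for starfile in fake_read_files(10000):
    total += 1

# Take only the first three items without materializing the rest.
first_three = [sf["data"] for sf in islice(fake_read_files(10000), 3)]

print(total, first_three)
```

The same pattern should apply to any generator, since islice() only requires an iterable.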

Accessing and manipulating data from a single StarFile

Since a StarFile is a Python collections.OrderedDict, data can be accessed and manipulated as with any regular Python dict object using bracket accessors.

In [7]:
starfile = next(nmrstarlib.read_files("15000"))

# list StarFile-level keys, i.e. saveframe names
list(starfile.keys())
Out[7]:
['data',
 'comment_0',
 'save_entry_information',
 'comment_1',
 'save_citation_1',
 'comment_2',
 'save_assembly',
 'comment_3',
 'save_F5-Phe-cVHP',
 'comment_4',
 'save_natural_source',
 'comment_5',
 'save_experimental_source',
 'comment_6',
 'save_chem_comp_PHF',
 'comment_7',
 'comment_8',
 'save_unlabeled_sample',
 'save_selectively_labeled_sample',
 'comment_9',
 'save_sample_conditions',
 'comment_10',
 'save_NMRPipe',
 'save_PIPP',
 'save_SPARKY',
 'save_CYANA',
 'save_X-PLOR_NIH',
 'comment_11',
 'comment_12',
 'save_spectrometer_1',
 'save_spectrometer_2',
 'save_spectrometer_3',
 'save_spectrometer_4',
 'save_spectrometer_5',
 'save_spectrometer_6',
 'save_NMR_spectrometer_list',
 'comment_13',
 'save_experiment_list',
 'comment_14',
 'comment_15',
 'comment_16',
 'save_chemical_shift_reference_1',
 'comment_17',
 'comment_18',
 'save_assigned_chem_shift_list_1']
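
The key listing above mixes three kinds of keys: "data", "comment_N", and "save_..." saveframe names. A minimal sketch of separating them by prefix follows. It uses a hand-built, abbreviated OrderedDict as a stand-in for a StarFile (an assumption for illustration; the comment texts and empty saveframes are placeholders).

```python
from collections import OrderedDict

# Hypothetical, abbreviated stand-in for a StarFile; values are placeholders.
starfile_like = OrderedDict([
    ("data", "15000"),
    ("comment_0", "# placeholder comment text"),
    ("save_entry_information", OrderedDict()),
    ("comment_1", "# placeholder comment text"),
    ("save_citation_1", OrderedDict()),
])

# Keep only saveframe names, in their original order.
saveframe_names = [key for key in starfile_like if key.startswith("save_")]

# Keep only comment keys.
comment_names = [key for key in starfile_like if key.startswith("comment_")]

print(saveframe_names)
print(comment_names)
```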
In [8]:
# access "data" field
starfile["data"]
Out[8]:
'15000'
In [9]:
# access saveframe
starfile["save_entry_information"]
Out[9]:
OrderedDict([('Entry.Sf_category', 'entry_information'),
             ('Entry.Sf_framecode', 'entry_information'),
             ('Entry.ID', '15000'),
             ('Entry.Title',
              'Solution structure of chicken villin headpiece subdomain containing a fluorinated side chain in the core\n'),
             ('Entry.Type', 'macromolecule'),
             ('Entry.Version_type', 'original'),
             ('Entry.Submission_date', '2006-09-07'),
             ('Entry.Accession_date', '2006-09-07'),
             ('Entry.Last_release_date', '.'),
             ('Entry.Original_release_date', '.'),
             ('Entry.Origination', 'author'),
             ('Entry.NMR_STAR_version', '3.1.1.61'),
             ('Entry.Original_NMR_STAR_version', '.'),
             ('Entry.Experimental_method', 'NMR'),
             ('Entry.Experimental_method_subtype', 'solution'),
             ('Entry.Details', '.'),
             ('Entry.BMRB_internal_directory_name', '.'),
             ('loop_0',
              (['Entry_author.Ordinal',
                'Entry_author.Given_name',
                'Entry_author.Family_name',
                'Entry_author.First_initial',
                'Entry_author.Middle_initials',
                'Entry_author.Family_title',
                'Entry_author.Entry_ID'],
               [OrderedDict([('Entry_author.Ordinal', '1'),
                             ('Entry_author.Given_name', 'Claudia'),
                             ('Entry_author.Family_name', 'Cornilescu'),
                             ('Entry_author.First_initial', '.'),
                             ('Entry_author.Middle_initials', 'C.'),
                             ('Entry_author.Family_title', '.'),
                             ('Entry_author.Entry_ID', '15000')]),
                OrderedDict([('Entry_author.Ordinal', '2'),
                             ('Entry_author.Given_name', 'Gabriel'),
                             ('Entry_author.Family_name', 'Cornilescu'),
                             ('Entry_author.First_initial', '.'),
                             ('Entry_author.Middle_initials', '.'),
                             ('Entry_author.Family_title', '.'),
                             ('Entry_author.Entry_ID', '15000')]),
                OrderedDict([('Entry_author.Ordinal', '3'),
                             ('Entry_author.Given_name', 'Erik'),
                             ('Entry_author.Family_name', 'Hadley'),
                             ('Entry_author.First_initial', '.'),
                             ('Entry_author.Middle_initials', 'B.'),
                             ('Entry_author.Family_title', '.'),
                             ('Entry_author.Entry_ID', '15000')]),
                OrderedDict([('Entry_author.Ordinal', '4'),
                             ('Entry_author.Given_name', 'Samuel'),
                             ('Entry_author.Family_name', 'Gellman'),
                             ('Entry_author.First_initial', '.'),
                             ('Entry_author.Middle_initials', 'H.'),
                             ('Entry_author.Family_title', '.'),
                             ('Entry_author.Entry_ID', '15000')]),
                OrderedDict([('Entry_author.Ordinal', '5'),
                             ('Entry_author.Given_name', 'John'),
                             ('Entry_author.Family_name', 'Markley'),
                             ('Entry_author.First_initial', '.'),
                             ('Entry_author.Middle_initials', 'L.'),
                             ('Entry_author.Family_title', '.'),
                             ('Entry_author.Entry_ID', '15000')])])),
             ('loop_1',
              (['SG_project.SG_project_ID',
                'SG_project.Project_name',
                'SG_project.Full_name_of_center',
                'SG_project.Initial_of_center',
                'SG_project.Entry_ID'],
               [OrderedDict([('SG_project.SG_project_ID', '1'),
                             ('SG_project.Project_name', 'not applicable'),
                             ('SG_project.Full_name_of_center',
                              'not applicable'),
                             ('SG_project.Initial_of_center', '.'),
                             ('SG_project.Entry_ID', '15000')])])),
             ('loop_2',
              (['Struct_keywords.Keywords',
                'Struct_keywords.Text',
                'Struct_keywords.Entry_ID'],
               [OrderedDict([('Struct_keywords.Keywords',
                              'chicken villin headpiece'),
                             ('Struct_keywords.Text', '.'),
                             ('Struct_keywords.Entry_ID', '15000')]),
                OrderedDict([('Struct_keywords.Keywords', 'fluorinated Phe'),
                             ('Struct_keywords.Text', '.'),
                             ('Struct_keywords.Entry_ID', '15000')]),
                OrderedDict([('Struct_keywords.Keywords', 'VHP'),
                             ('Struct_keywords.Text', '.'),
                             ('Struct_keywords.Entry_ID', '15000')])])),
             ('loop_3',
              (['Data_set.Type', 'Data_set.Count', 'Data_set.Entry_ID'],
               [OrderedDict([('Data_set.Type', 'assigned_chemical_shifts'),
                             ('Data_set.Count', '1'),
                             ('Data_set.Entry_ID', '15000')])])),
             ('loop_4',
              (['Datum.Type', 'Datum.Count', 'Datum.Entry_ID'],
               [OrderedDict([('Datum.Type', '13C chemical shifts'),
                             ('Datum.Count', '77'),
                             ('Datum.Entry_ID', '15000')]),
                OrderedDict([('Datum.Type', '15N chemical shifts'),
                             ('Datum.Count', '40'),
                             ('Datum.Entry_ID', '15000')]),
                OrderedDict([('Datum.Type', '1H chemical shifts'),
                             ('Datum.Count', '223'),
                             ('Datum.Entry_ID', '15000')])])),
             ('loop_5',
              (['Release.Release_number',
                'Release.Format_type',
                'Release.Format_version',
                'Release.Date',
                'Release.Submission_date',
                'Release.Type',
                'Release.Author',
                'Release.Detail',
                'Release.Entry_ID'],
               [OrderedDict([('Release.Release_number', '2'),
                             ('Release.Format_type', '.'),
                             ('Release.Format_version', '.'),
                             ('Release.Date', '2008-07-17'),
                             ('Release.Submission_date', '2006-09-06'),
                             ('Release.Type', 'update'),
                             ('Release.Author', 'BMRB'),
                             ('Release.Detail', 'complete entry citation'),
                             ('Release.Entry_ID', '15000')]),
                OrderedDict([('Release.Release_number', '1'),
                             ('Release.Format_type', '.'),
                             ('Release.Format_version', '.'),
                             ('Release.Date', '2006-10-20'),
                             ('Release.Submission_date', '2006-09-06'),
                             ('Release.Type', 'original'),
                             ('Release.Author', 'author'),
                             ('Release.Detail', 'original release'),
                             ('Release.Entry_ID', '15000')])])),
             ('loop_6',
              (['Related_entries.Database_name',
                'Related_entries.Database_accession_code',
                'Related_entries.Relationship',
                'Related_entries.Entry_ID'],
               [OrderedDict([('Related_entries.Database_name', 'PDB'),
                             ('Related_entries.Database_accession_code',
                              '2JM0'),
                             ('Related_entries.Relationship',
                              'BMRB Entry Tracking System'),
                             ('Related_entries.Entry_ID', '15000')])]))])
In [10]:
# list saveframe-level keys
list(starfile["save_entry_information"].keys())
Out[10]:
['Entry.Sf_category',
 'Entry.Sf_framecode',
 'Entry.ID',
 'Entry.Title',
 'Entry.Type',
 'Entry.Version_type',
 'Entry.Submission_date',
 'Entry.Accession_date',
 'Entry.Last_release_date',
 'Entry.Original_release_date',
 'Entry.Origination',
 'Entry.NMR_STAR_version',
 'Entry.Original_NMR_STAR_version',
 'Entry.Experimental_method',
 'Entry.Experimental_method_subtype',
 'Entry.Details',
 'Entry.BMRB_internal_directory_name',
 'loop_0',
 'loop_1',
 'loop_2',
 'loop_3',
 'loop_4',
 'loop_5',
 'loop_6']
In [11]:
# access 'key-value' pairs within saveframes
starfile["save_entry_information"]["Entry.Submission_date"]
Out[11]:
'2006-09-07'
In [12]:
# access loops
starfile["save_entry_information"]["loop_0"]
Out[12]:
(['Entry_author.Ordinal',
  'Entry_author.Given_name',
  'Entry_author.Family_name',
  'Entry_author.First_initial',
  'Entry_author.Middle_initials',
  'Entry_author.Family_title',
  'Entry_author.Entry_ID'],
 [OrderedDict([('Entry_author.Ordinal', '1'),
               ('Entry_author.Given_name', 'Claudia'),
               ('Entry_author.Family_name', 'Cornilescu'),
               ('Entry_author.First_initial', '.'),
               ('Entry_author.Middle_initials', 'C.'),
               ('Entry_author.Family_title', '.'),
               ('Entry_author.Entry_ID', '15000')]),
  OrderedDict([('Entry_author.Ordinal', '2'),
               ('Entry_author.Given_name', 'Gabriel'),
               ('Entry_author.Family_name', 'Cornilescu'),
               ('Entry_author.First_initial', '.'),
               ('Entry_author.Middle_initials', '.'),
               ('Entry_author.Family_title', '.'),
               ('Entry_author.Entry_ID', '15000')]),
  OrderedDict([('Entry_author.Ordinal', '3'),
               ('Entry_author.Given_name', 'Erik'),
               ('Entry_author.Family_name', 'Hadley'),
               ('Entry_author.First_initial', '.'),
               ('Entry_author.Middle_initials', 'B.'),
               ('Entry_author.Family_title', '.'),
               ('Entry_author.Entry_ID', '15000')]),
  OrderedDict([('Entry_author.Ordinal', '4'),
               ('Entry_author.Given_name', 'Samuel'),
               ('Entry_author.Family_name', 'Gellman'),
               ('Entry_author.First_initial', '.'),
               ('Entry_author.Middle_initials', 'H.'),
               ('Entry_author.Family_title', '.'),
               ('Entry_author.Entry_ID', '15000')]),
  OrderedDict([('Entry_author.Ordinal', '5'),
               ('Entry_author.Given_name', 'John'),
               ('Entry_author.Family_name', 'Markley'),
               ('Entry_author.First_initial', '.'),
               ('Entry_author.Middle_initials', 'L.'),
               ('Entry_author.Family_title', '.'),
               ('Entry_author.Entry_ID', '15000')])])
In [13]:
# list loop-level fields
starfile["save_entry_information"]["loop_0"][0]
Out[13]:
['Entry_author.Ordinal',
 'Entry_author.Given_name',
 'Entry_author.Family_name',
 'Entry_author.First_initial',
 'Entry_author.Middle_initials',
 'Entry_author.Family_title',
 'Entry_author.Entry_ID']
In [14]:
# list loop-level values (list of dictionaries)
starfile["save_entry_information"]["loop_0"][1]
Out[14]:
[OrderedDict([('Entry_author.Ordinal', '1'),
              ('Entry_author.Given_name', 'Claudia'),
              ('Entry_author.Family_name', 'Cornilescu'),
              ('Entry_author.First_initial', '.'),
              ('Entry_author.Middle_initials', 'C.'),
              ('Entry_author.Family_title', '.'),
              ('Entry_author.Entry_ID', '15000')]),
 OrderedDict([('Entry_author.Ordinal', '2'),
              ('Entry_author.Given_name', 'Gabriel'),
              ('Entry_author.Family_name', 'Cornilescu'),
              ('Entry_author.First_initial', '.'),
              ('Entry_author.Middle_initials', '.'),
              ('Entry_author.Family_title', '.'),
              ('Entry_author.Entry_ID', '15000')]),
 OrderedDict([('Entry_author.Ordinal', '3'),
              ('Entry_author.Given_name', 'Erik'),
              ('Entry_author.Family_name', 'Hadley'),
              ('Entry_author.First_initial', '.'),
              ('Entry_author.Middle_initials', 'B.'),
              ('Entry_author.Family_title', '.'),
              ('Entry_author.Entry_ID', '15000')]),
 OrderedDict([('Entry_author.Ordinal', '4'),
              ('Entry_author.Given_name', 'Samuel'),
              ('Entry_author.Family_name', 'Gellman'),
              ('Entry_author.First_initial', '.'),
              ('Entry_author.Middle_initials', 'H.'),
              ('Entry_author.Family_title', '.'),
              ('Entry_author.Entry_ID', '15000')]),
 OrderedDict([('Entry_author.Ordinal', '5'),
              ('Entry_author.Given_name', 'John'),
              ('Entry_author.Family_name', 'Markley'),
              ('Entry_author.First_initial', '.'),
              ('Entry_author.Middle_initials', 'L.'),
              ('Entry_author.Family_title', '.'),
              ('Entry_author.Entry_ID', '15000')])]
In [15]:
# every loop entry is accessed by index
starfile["save_entry_information"]["loop_0"][1][0]["Entry_author.Family_name"]
Out[15]:
'Cornilescu'
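
Since a loop is a pair of (field names, list of rows), as the outputs above show, it can be unpacked and iterated over instead of indexing each row by hand. The sketch below rebuilds an abbreviated copy of the author loop (two fields, three rows) as a stand-in; the variable names are illustrative assumptions.

```python
from collections import OrderedDict

# Abbreviated stand-in for starfile["save_entry_information"]["loop_0"].
fields = ["Entry_author.Given_name", "Entry_author.Family_name"]
rows = [
    OrderedDict([("Entry_author.Given_name", "Claudia"),
                 ("Entry_author.Family_name", "Cornilescu")]),
    OrderedDict([("Entry_author.Given_name", "Gabriel"),
                 ("Entry_author.Family_name", "Cornilescu")]),
    OrderedDict([("Entry_author.Given_name", "Erik"),
                 ("Entry_author.Family_name", "Hadley")]),
]
loop = (fields, rows)

# Unpack the loop into its field names and its rows.
loop_fields, loop_rows = loop

# Collect one field from every row.
family_names = [row["Entry_author.Family_name"] for row in loop_rows]

print(loop_fields)
print(family_names)
```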
  • Manipulating data in a StarFile is easy - access data using bracket accessors and set a new value:
In [16]:
# check submission date
starfile["save_entry_information"]["Entry.Submission_date"]
Out[16]:
'2006-09-07'
In [17]:
# change submission date
starfile["save_entry_information"]["Entry.Submission_date"] = "2015-07-05"
In [18]:
# check that submission date is updated
starfile["save_entry_information"]["Entry.Submission_date"]
Out[18]:
'2015-07-05'
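
Values inside loops can be changed the same way, with one caveat that follows from the structure shown earlier: each loop is a tuple, so its rows (dictionaries) can be edited in place, but the tuple's two slots cannot be reassigned. A minimal sketch with a hand-built stand-in saveframe follows; the new value "78" is arbitrary and used only for illustration.

```python
from collections import OrderedDict

# Hypothetical, abbreviated stand-in for a saveframe with a single loop.
saveframe = OrderedDict([
    ("Entry.Submission_date", "2006-09-07"),
    ("loop_0", (
        ["Datum.Type", "Datum.Count"],
        [OrderedDict([("Datum.Type", "13C chemical shifts"),
                      ("Datum.Count", "77")])],
    )),
])

# Rows are mutable dictionaries, so a value inside a loop can be set in place.
saveframe["loop_0"][1][0]["Datum.Count"] = "78"

# The loop itself is a tuple, so its slots cannot be reassigned.
try:
    saveframe["loop_0"][0] = ["Datum.Type"]
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

print(saveframe["loop_0"][1][0]["Datum.Count"], tuple_is_immutable)
```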
  • Printing a StarFile and its components (saveframe and loop data):
In [19]:
starfile = next(nmrstarlib.read_files("bmr15000.str"))
In [20]:
starfile.print_file(file_format="nmrstar")
data_15000

#######################
#  Entry information  #
#######################

save_entry_information
   _Entry.Sf_category    entry_information
   _Entry.Sf_framecode   entry_information
   _Entry.ID     15000
   _Entry.Title
;
Solution structure of chicken villin headpiece subdomain containing a fluorinated side chain in the core
;
   _Entry.Type   macromolecule
   _Entry.Version_type   original
   _Entry.Submission_date        2006-09-07
   _Entry.Accession_date         2006-09-07
   _Entry.Last_release_date      .
   _Entry.Original_release_date  .
   _Entry.Origination    author
   _Entry.NMR_STAR_version       3.1.1.61
   _Entry.Original_NMR_STAR_version      .
   _Entry.Experimental_method    NMR
   _Entry.Experimental_method_subtype    solution
   _Entry.Details        .
   _Entry.BMRB_internal_directory_name   .

   loop_
      _Entry_author.Ordinal
      _Entry_author.Given_name
      _Entry_author.Family_name
      _Entry_author.First_initial
      _Entry_author.Middle_initials
      _Entry_author.Family_title
      _Entry_author.Entry_ID

      1 Claudia Cornilescu . C. . 15000
      2 Gabriel Cornilescu . . . 15000
      3 Erik Hadley . B. . 15000
      4 Samuel Gellman . H. . 15000
      5 John Markley . L. . 15000

   stop_

   loop_
      _Datum.Type
      _Datum.Count
      _Datum.Entry_ID

      '13C chemical shifts' 77 15000
      '15N chemical shifts' 40 15000
      '1H chemical shifts' 223 15000

   stop_

save_


save_assigned_chem_shift_list_1
   _Assigned_chem_shift_list.Sf_category         assigned_chemical_shifts
   _Assigned_chem_shift_list.Sf_framecode        assigned_chem_shift_list_1
   _Assigned_chem_shift_list.Entry_ID    15000
   _Assigned_chem_shift_list.ID  1
   _Assigned_chem_shift_list.Sample_condition_list_ID    1
   _Assigned_chem_shift_list.Sample_condition_list_label         $sample_conditions
   _Assigned_chem_shift_list.Chem_shift_reference_ID     1
   _Assigned_chem_shift_list.Chem_shift_reference_label  $chemical_shift_reference_1
   _Assigned_chem_shift_list.Chem_shift_1H_err   .
   _Assigned_chem_shift_list.Chem_shift_13C_err  .
   _Assigned_chem_shift_list.Chem_shift_15N_err  .
   _Assigned_chem_shift_list.Chem_shift_31P_err  .
   _Assigned_chem_shift_list.Chem_shift_2H_err   .
   _Assigned_chem_shift_list.Chem_shift_19F_err  .
   _Assigned_chem_shift_list.Error_derivation_method     .
   _Assigned_chem_shift_list.Details     .
   _Assigned_chem_shift_list.Text_data_format    .
   _Assigned_chem_shift_list.Text_data   .

   loop_
      _Atom_chem_shift.ID
      _Atom_chem_shift.Assembly_atom_ID
      _Atom_chem_shift.Entity_assembly_ID
      _Atom_chem_shift.Entity_ID
      _Atom_chem_shift.Comp_index_ID
      _Atom_chem_shift.Seq_ID
      _Atom_chem_shift.Comp_ID
      _Atom_chem_shift.Atom_ID
      _Atom_chem_shift.Atom_type
      _Atom_chem_shift.Atom_isotope_number
      _Atom_chem_shift.Val
      _Atom_chem_shift.Val_err
      _Atom_chem_shift.Assign_fig_of_merit
      _Atom_chem_shift.Ambiguity_code
      _Atom_chem_shift.Occupancy
      _Atom_chem_shift.Resonance_ID
      _Atom_chem_shift.Auth_entity_assembly_ID
      _Atom_chem_shift.Auth_asym_ID
      _Atom_chem_shift.Auth_seq_ID
      _Atom_chem_shift.Auth_comp_ID
      _Atom_chem_shift.Auth_atom_ID
      _Atom_chem_shift.Details
      _Atom_chem_shift.Entry_ID
      _Atom_chem_shift.Assigned_chem_shift_list_ID

      1 . 1 1 2 2 SER H H 1 9.3070 0.01 . . . . . . 2 SER H . 15000 1
      2 . 1 1 2 2 SER HA H 1 4.5970 0.01 . . . . . . 2 SER HA . 15000 1
      3 . 1 1 2 2 SER HB2 H 1 4.3010 0.01 . . . . . . 2 SER HB2 . 15000 1
      4 . 1 1 2 2 SER HB3 H 1 4.0550 0.01 . . . . . . 2 SER HB3 . 15000 1
      5 . 1 1 2 2 SER CB C 13 64.6000 0.1 . . . . . . 2 SER CB . 15000 1
      6 . 1 1 2 2 SER N N 15 121.5800 0.1 . . . . . . 2 SER N . 15000 1
      7 . 1 1 3 3 ASP H H 1 8.0740 0.01 . . . . . . 3 ASP H . 15000 1
      8 . 1 1 3 3 ASP HA H 1 4.5580 0.01 . . . . . . 3 ASP HA . 15000 1
      9 . 1 1 3 3 ASP HB2 H 1 2.835 0.01 . . . . . . 3 ASP HB2 . 15000 1
      10 . 1 1 3 3 ASP HB3 H 1 2.754 0.01 . . . . . . 3 ASP HB3 . 15000 1
      11 . 1 1 3 3 ASP CA C 13 57.6400 0.1 . . . . . . 3 ASP CA . 15000 1
      12 . 1 1 3 3 ASP N N 15 121.1040 0.1 . . . . . . 3 ASP N . 15000 1
      13 . 1 1 4 4 GLU H H 1 8.6520 0.01 . . . . . . 4 GLU H . 15000 1
      14 . 1 1 4 4 GLU HA H 1 4.1420 0.01 . . . . . . 4 GLU HA . 15000 1
      15 . 1 1 4 4 GLU HB2 H 1 2.0520 0.01 . . . . . . 4 GLU HB2 . 15000 1
      16 . 1 1 4 4 GLU HB3 H 1 2.0320 0.01 . . . . . . 4 GLU HB3 . 15000 1
      17 . 1 1 4 4 GLU HG2 H 1 2.4540 0.01 . . . . . . 4 GLU HG2 . 15000 1
      18 . 1 1 4 4 GLU CB C 13 28.1200 0.1 . . . . . . 4 GLU CB . 15000 1
      19 . 1 1 4 4 GLU CG C 13 33.2720 0.1 . . . . . . 4 GLU CG . 15000 1
      20 . 1 1 4 4 GLU N N 15 119.8900 0.1 . . . . . . 4 GLU N . 15000 1

   stop_

save_
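
The nested dictionaries and loop tuples shown earlier can also be serialized with Python's standard json module. The sketch below is not necessarily how nmrstarlib produces its JSON output; it only illustrates, on a hand-built stand-in saveframe, that a loop tuple becomes a two-element JSON array and comes back as a list when decoded.

```python
import json
from collections import OrderedDict

# Hypothetical, abbreviated stand-in for a saveframe with a single loop.
saveframe = OrderedDict([
    ("Entry.ID", "15000"),
    ("loop_0", (
        ["Datum.Type", "Datum.Count"],
        [OrderedDict([("Datum.Type", "1H chemical shifts"),
                      ("Datum.Count", "223")])],
    )),
])

# Tuples are written as JSON arrays; key order is preserved.
text = json.dumps(saveframe, indent=4)

# Decoding returns lists instead of tuples; object_pairs_hook keeps OrderedDicts.
decoded = json.loads(text, object_pairs_hook=OrderedDict)

print(text)
```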


In [21]:
starfile.print_file(file_format="json")
{
    "data": "15000",
    "comment_0": "#######################\n#  Entry information  #\n#######################\n",
    "save_entry_information": {
        "Entry.Sf_category": "entry_information",
        "Entry.Sf_framecode": "entry_information",
        "Entry.ID": "15000",
        "Entry.Title": "Solution structure of chicken villin headpiece subdomain containing a fluorinated side chain in the core\n",
        "Entry.Type": "macromolecule",
        "Entry.Version_type": "original",
        "Entry.Submission_date": "2006-09-07",
        "Entry.Accession_date": "2006-09-07",
        "Entry.Last_release_date": ".",
        "Entry.Original_release_date": ".",
        "Entry.Origination": "author",
        "Entry.NMR_STAR_version": "3.1.1.61",
        "Entry.Original_NMR_STAR_version": ".",
        "Entry.Experimental_method": "NMR",
        "Entry.Experimental_method_subtype": "solution",
        "Entry.Details": ".",
        "Entry.BMRB_internal_directory_name": ".",
        "loop_0": [
            [
                "Entry_author.Ordinal",
                "Entry_author.Given_name",
                "Entry_author.Family_name",
                "Entry_author.First_initial",
                "Entry_author.Middle_initials",
                "Entry_author.Family_title",
                "Entry_author.Entry_ID"
            ],
            [
                {
                    "Entry_author.Ordinal": "1",
                    "Entry_author.Given_name": "Claudia",
                    "Entry_author.Family_name": "Cornilescu",
                    "Entry_author.First_initial": ".",
                    "Entry_author.Middle_initials": "C.",
                    "Entry_author.Family_title": ".",
                    "Entry_author.Entry_ID": "15000"
                },
                {
                    "Entry_author.Ordinal": "2",
                    "Entry_author.Given_name": "Gabriel",
                    "Entry_author.Family_name": "Cornilescu",
                    "Entry_author.First_initial": ".",
                    "Entry_author.Middle_initials": ".",
                    "Entry_author.Family_title": ".",
                    "Entry_author.Entry_ID": "15000"
                },
                {
                    "Entry_author.Ordinal": "3",
                    "Entry_author.Given_name": "Erik",
                    "Entry_author.Family_name": "Hadley",
                    "Entry_author.First_initial": ".",
                    "Entry_author.Middle_initials": "B.",
                    "Entry_author.Family_title": ".",
                    "Entry_author.Entry_ID": "15000"
                },
                {
                    "Entry_author.Ordinal": "4",
                    "Entry_author.Given_name": "Samuel",
                    "Entry_author.Family_name": "Gellman",
                    "Entry_author.First_initial": ".",
                    "Entry_author.Middle_initials": "H.",
                    "Entry_author.Family_title": ".",
                    "Entry_author.Entry_ID": "15000"
                },
                {
                    "Entry_author.Ordinal": "5",
                    "Entry_author.Given_name": "John",
                    "Entry_author.Family_name": "Markley",
                    "Entry_author.First_initial": ".",
                    "Entry_author.Middle_initials": "L.",
                    "Entry_author.Family_title": ".",
                    "Entry_author.Entry_ID": "15000"
                }
            ]
        ],
        "loop_1": [
            [
                "Datum.Type",
                "Datum.Count",
                "Datum.Entry_ID"
            ],
            [
                {
                    "Datum.Type": "13C chemical shifts",
                    "Datum.Count": "77",
                    "Datum.Entry_ID": "15000"
                },
                {
                    "Datum.Type": "15N chemical shifts",
                    "Datum.Count": "40",
                    "Datum.Entry_ID": "15000"
                },
                {
                    "Datum.Type": "1H chemical shifts",
                    "Datum.Count": "223",
                    "Datum.Entry_ID": "15000"
                }
            ]
        ]
    },
    "save_assigned_chem_shift_list_1": {
        "Assigned_chem_shift_list.Sf_category": "assigned_chemical_shifts",
        "Assigned_chem_shift_list.Sf_framecode": "assigned_chem_shift_list_1",
        "Assigned_chem_shift_list.Entry_ID": "15000",
        "Assigned_chem_shift_list.ID": "1",
        "Assigned_chem_shift_list.Sample_condition_list_ID": "1",
        "Assigned_chem_shift_list.Sample_condition_list_label": "$sample_conditions",
        "Assigned_chem_shift_list.Chem_shift_reference_ID": "1",
        "Assigned_chem_shift_list.Chem_shift_reference_label": "$chemical_shift_reference_1",
        "Assigned_chem_shift_list.Chem_shift_1H_err": ".",
        "Assigned_chem_shift_list.Chem_shift_13C_err": ".",
        "Assigned_chem_shift_list.Chem_shift_15N_err": ".",
        "Assigned_chem_shift_list.Chem_shift_31P_err": ".",
        "Assigned_chem_shift_list.Chem_shift_2H_err": ".",
        "Assigned_chem_shift_list.Chem_shift_19F_err": ".",
        "Assigned_chem_shift_list.Error_derivation_method": ".",
        "Assigned_chem_shift_list.Details": ".",
        "Assigned_chem_shift_list.Text_data_format": ".",
        "Assigned_chem_shift_list.Text_data": ".",
        "loop_0": [
            [
                "Atom_chem_shift.ID",
                "Atom_chem_shift.Assembly_atom_ID",
                "Atom_chem_shift.Entity_assembly_ID",
                "Atom_chem_shift.Entity_ID",
                "Atom_chem_shift.Comp_index_ID",
                "Atom_chem_shift.Seq_ID",
                "Atom_chem_shift.Comp_ID",
                "Atom_chem_shift.Atom_ID",
                "Atom_chem_shift.Atom_type",
                "Atom_chem_shift.Atom_isotope_number",
                "Atom_chem_shift.Val",
                "Atom_chem_shift.Val_err",
                "Atom_chem_shift.Assign_fig_of_merit",
                "Atom_chem_shift.Ambiguity_code",
                "Atom_chem_shift.Occupancy",
                "Atom_chem_shift.Resonance_ID",
                "Atom_chem_shift.Auth_entity_assembly_ID",
                "Atom_chem_shift.Auth_asym_ID",
                "Atom_chem_shift.Auth_seq_ID",
                "Atom_chem_shift.Auth_comp_ID",
                "Atom_chem_shift.Auth_atom_ID",
                "Atom_chem_shift.Details",
                "Atom_chem_shift.Entry_ID",
                "Atom_chem_shift.Assigned_chem_shift_list_ID"
            ],
            [
                {
                    "Atom_chem_shift.ID": "1",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "2",
                    "Atom_chem_shift.Seq_ID": "2",
                    "Atom_chem_shift.Comp_ID": "SER",
                    "Atom_chem_shift.Atom_ID": "H",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "9.3070",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "2",
                    "Atom_chem_shift.Auth_comp_ID": "SER",
                    "Atom_chem_shift.Auth_atom_ID": "H",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "2",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "2",
                    "Atom_chem_shift.Seq_ID": "2",
                    "Atom_chem_shift.Comp_ID": "SER",
                    "Atom_chem_shift.Atom_ID": "HA",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "4.5970",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "2",
                    "Atom_chem_shift.Auth_comp_ID": "SER",
                    "Atom_chem_shift.Auth_atom_ID": "HA",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "3",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "2",
                    "Atom_chem_shift.Seq_ID": "2",
                    "Atom_chem_shift.Comp_ID": "SER",
                    "Atom_chem_shift.Atom_ID": "HB2",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "4.3010",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "2",
                    "Atom_chem_shift.Auth_comp_ID": "SER",
                    "Atom_chem_shift.Auth_atom_ID": "HB2",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "4",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "2",
                    "Atom_chem_shift.Seq_ID": "2",
                    "Atom_chem_shift.Comp_ID": "SER",
                    "Atom_chem_shift.Atom_ID": "HB3",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "4.0550",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "2",
                    "Atom_chem_shift.Auth_comp_ID": "SER",
                    "Atom_chem_shift.Auth_atom_ID": "HB3",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "5",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "2",
                    "Atom_chem_shift.Seq_ID": "2",
                    "Atom_chem_shift.Comp_ID": "SER",
                    "Atom_chem_shift.Atom_ID": "CB",
                    "Atom_chem_shift.Atom_type": "C",
                    "Atom_chem_shift.Atom_isotope_number": "13",
                    "Atom_chem_shift.Val": "64.6000",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "2",
                    "Atom_chem_shift.Auth_comp_ID": "SER",
                    "Atom_chem_shift.Auth_atom_ID": "CB",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "6",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "2",
                    "Atom_chem_shift.Seq_ID": "2",
                    "Atom_chem_shift.Comp_ID": "SER",
                    "Atom_chem_shift.Atom_ID": "N",
                    "Atom_chem_shift.Atom_type": "N",
                    "Atom_chem_shift.Atom_isotope_number": "15",
                    "Atom_chem_shift.Val": "121.5800",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "2",
                    "Atom_chem_shift.Auth_comp_ID": "SER",
                    "Atom_chem_shift.Auth_atom_ID": "N",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "7",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "3",
                    "Atom_chem_shift.Seq_ID": "3",
                    "Atom_chem_shift.Comp_ID": "ASP",
                    "Atom_chem_shift.Atom_ID": "H",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "8.0740",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "3",
                    "Atom_chem_shift.Auth_comp_ID": "ASP",
                    "Atom_chem_shift.Auth_atom_ID": "H",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "8",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "3",
                    "Atom_chem_shift.Seq_ID": "3",
                    "Atom_chem_shift.Comp_ID": "ASP",
                    "Atom_chem_shift.Atom_ID": "HA",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "4.5580",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "3",
                    "Atom_chem_shift.Auth_comp_ID": "ASP",
                    "Atom_chem_shift.Auth_atom_ID": "HA",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "9",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "3",
                    "Atom_chem_shift.Seq_ID": "3",
                    "Atom_chem_shift.Comp_ID": "ASP",
                    "Atom_chem_shift.Atom_ID": "HB2",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "2.835",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "3",
                    "Atom_chem_shift.Auth_comp_ID": "ASP",
                    "Atom_chem_shift.Auth_atom_ID": "HB2",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "10",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "3",
                    "Atom_chem_shift.Seq_ID": "3",
                    "Atom_chem_shift.Comp_ID": "ASP",
                    "Atom_chem_shift.Atom_ID": "HB3",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "2.754",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "3",
                    "Atom_chem_shift.Auth_comp_ID": "ASP",
                    "Atom_chem_shift.Auth_atom_ID": "HB3",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "11",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "3",
                    "Atom_chem_shift.Seq_ID": "3",
                    "Atom_chem_shift.Comp_ID": "ASP",
                    "Atom_chem_shift.Atom_ID": "CA",
                    "Atom_chem_shift.Atom_type": "C",
                    "Atom_chem_shift.Atom_isotope_number": "13",
                    "Atom_chem_shift.Val": "57.6400",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "3",
                    "Atom_chem_shift.Auth_comp_ID": "ASP",
                    "Atom_chem_shift.Auth_atom_ID": "CA",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "12",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "3",
                    "Atom_chem_shift.Seq_ID": "3",
                    "Atom_chem_shift.Comp_ID": "ASP",
                    "Atom_chem_shift.Atom_ID": "N",
                    "Atom_chem_shift.Atom_type": "N",
                    "Atom_chem_shift.Atom_isotope_number": "15",
                    "Atom_chem_shift.Val": "121.1040",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "3",
                    "Atom_chem_shift.Auth_comp_ID": "ASP",
                    "Atom_chem_shift.Auth_atom_ID": "N",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "13",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "H",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "8.6520",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "H",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "14",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "HA",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "4.1420",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "HA",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "15",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "HB2",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "2.0520",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "HB2",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "16",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "HB3",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "2.0320",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "HB3",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "17",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "HG2",
                    "Atom_chem_shift.Atom_type": "H",
                    "Atom_chem_shift.Atom_isotope_number": "1",
                    "Atom_chem_shift.Val": "2.4540",
                    "Atom_chem_shift.Val_err": "0.01",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "HG2",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "18",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "CB",
                    "Atom_chem_shift.Atom_type": "C",
                    "Atom_chem_shift.Atom_isotope_number": "13",
                    "Atom_chem_shift.Val": "28.1200",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "CB",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "19",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "CG",
                    "Atom_chem_shift.Atom_type": "C",
                    "Atom_chem_shift.Atom_isotope_number": "13",
                    "Atom_chem_shift.Val": "33.2720",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "CG",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                },
                {
                    "Atom_chem_shift.ID": "20",
                    "Atom_chem_shift.Assembly_atom_ID": ".",
                    "Atom_chem_shift.Entity_assembly_ID": "1",
                    "Atom_chem_shift.Entity_ID": "1",
                    "Atom_chem_shift.Comp_index_ID": "4",
                    "Atom_chem_shift.Seq_ID": "4",
                    "Atom_chem_shift.Comp_ID": "GLU",
                    "Atom_chem_shift.Atom_ID": "N",
                    "Atom_chem_shift.Atom_type": "N",
                    "Atom_chem_shift.Atom_isotope_number": "15",
                    "Atom_chem_shift.Val": "119.8900",
                    "Atom_chem_shift.Val_err": "0.1",
                    "Atom_chem_shift.Assign_fig_of_merit": ".",
                    "Atom_chem_shift.Ambiguity_code": ".",
                    "Atom_chem_shift.Occupancy": ".",
                    "Atom_chem_shift.Resonance_ID": ".",
                    "Atom_chem_shift.Auth_entity_assembly_ID": ".",
                    "Atom_chem_shift.Auth_asym_ID": ".",
                    "Atom_chem_shift.Auth_seq_ID": "4",
                    "Atom_chem_shift.Auth_comp_ID": "GLU",
                    "Atom_chem_shift.Auth_atom_ID": "N",
                    "Atom_chem_shift.Details": ".",
                    "Atom_chem_shift.Entry_ID": "15000",
                    "Atom_chem_shift.Assigned_chem_shift_list_ID": "1"
                }
            ]
        ]
    }
}
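The JSON output above shows that a loop is stored as a two-element list: the list of field names, followed by a list of row dictionaries keyed by those names. As a minimal sketch of how such a structure could be traversed, the snippet below groups chemical shift values by residue. It works on a small hand-copied excerpt of the rows printed above rather than on a live StarFile, and the helper name shifts_by_residue is hypothetical, not part of nmrstarlib.

```python
from collections import defaultdict

# Hand-copied excerpt of the "loop_0" structure printed above:
# element 0 is the list of field names, element 1 is the list of row dicts.
loop = [
    ["Atom_chem_shift.Seq_ID", "Atom_chem_shift.Comp_ID",
     "Atom_chem_shift.Atom_ID", "Atom_chem_shift.Val"],
    [
        {"Atom_chem_shift.Seq_ID": "2", "Atom_chem_shift.Comp_ID": "SER",
         "Atom_chem_shift.Atom_ID": "H", "Atom_chem_shift.Val": "9.3070"},
        {"Atom_chem_shift.Seq_ID": "2", "Atom_chem_shift.Comp_ID": "SER",
         "Atom_chem_shift.Atom_ID": "HA", "Atom_chem_shift.Val": "4.5970"},
        {"Atom_chem_shift.Seq_ID": "3", "Atom_chem_shift.Comp_ID": "ASP",
         "Atom_chem_shift.Atom_ID": "CA", "Atom_chem_shift.Val": "57.6400"},
    ],
]


def shifts_by_residue(loop):
    """Group shift values by (sequence id, residue name).

    Hypothetical helper: all values in the loop are strings, so the
    sequence id and the shift value are converted explicitly.
    """
    _fields, rows = loop
    grouped = defaultdict(dict)
    for row in rows:
        key = (int(row["Atom_chem_shift.Seq_ID"]), row["Atom_chem_shift.Comp_ID"])
        grouped[key][row["Atom_chem_shift.Atom_ID"]] = float(row["Atom_chem_shift.Val"])
    return dict(grouped)


residue_shifts = shifts_by_residue(loop)
for (seq_id, comp_id), atoms in sorted(residue_shifts.items()):
    print(seq_id, comp_id, atoms)
```

Note that every value in the loop, including numeric ones, is a string, so any arithmetic on chemical shifts needs an explicit conversion first.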
In [22]:
starfile.print_saveframe("save_entry_information", file_format="nmrstar")
   _Entry.Sf_category    entry_information
   _Entry.Sf_framecode   entry_information
   _Entry.ID     15000
   _Entry.Title
;
Solution structure of chicken villin headpiece subdomain containing a fluorinated side chain in the core
;
   _Entry.Type   macromolecule
   _Entry.Version_type   original
   _Entry.Submission_date        2006-09-07
   _Entry.Accession_date         2006-09-07
   _Entry.Last_release_date      .
   _Entry.Original_release_date  .
   _Entry.Origination    author
   _Entry.NMR_STAR_version       3.1.1.61
   _Entry.Original_NMR_STAR_version      .
   _Entry.Experimental_method    NMR
   _Entry.Experimental_method_subtype    solution
   _Entry.Details        .
   _Entry.BMRB_internal_directory_name   .

   loop_
      _Entry_author.Ordinal
      _Entry_author.Given_name
      _Entry_author.Family_name
      _Entry_author.First_initial
      _Entry_author.Middle_initials
      _Entry_author.Family_title
      _Entry_author.Entry_ID

      1 Claudia Cornilescu . C. . 15000
      2 Gabriel Cornilescu . . . 15000
      3 Erik Hadley . B. . 15000
      4 Samuel Gellman . H. . 15000
      5 John Markley . L. . 15000

   stop_

   loop_
      _Datum.Type
      _Datum.Count
      _Datum.Entry_ID

      '13C chemical shifts' 77 15000
      '15N chemical shifts' 40 15000
      '1H chemical shifts' 223 15000

   stop_
In [23]:
starfile.print_saveframe("save_entry_information", file_format="json")
{
    "Entry.Sf_category": "entry_information",
    "Entry.Sf_framecode": "entry_information",
    "Entry.ID": "15000",
    "Entry.Title": "Solution structure of chicken villin headpiece subdomain containing a fluorinated side chain in the core\n",
    "Entry.Type": "macromolecule",
    "Entry.Version_type": "original",
    "Entry.Submission_date": "2006-09-07",
    "Entry.Accession_date": "2006-09-07",
    "Entry.Last_release_date": ".",
    "Entry.Original_release_date": ".",
    "Entry.Origination": "author",
    "Entry.NMR_STAR_version": "3.1.1.61",
    "Entry.Original_NMR_STAR_version": ".",
    "Entry.Experimental_method": "NMR",
    "Entry.Experimental_method_subtype": "solution",
    "Entry.Details": ".",
    "Entry.BMRB_internal_directory_name": ".",
    "loop_0": [
        [
            "Entry_author.Ordinal",
            "Entry_author.Given_name",
            "Entry_author.Family_name",
            "Entry_author.First_initial",
            "Entry_author.Middle_initials",
            "Entry_author.Family_title",
            "Entry_author.Entry_ID"
        ],
        [
            {
                "Entry_author.Ordinal": "1",
                "Entry_author.Given_name": "Claudia",
                "Entry_author.Family_name": "Cornilescu",
                "Entry_author.First_initial": ".",
                "Entry_author.Middle_initials": "C.",
                "Entry_author.Family_title": ".",
                "Entry_author.Entry_ID": "15000"
            },
            {
                "Entry_author.Ordinal": "2",
                "Entry_author.Given_name": "Gabriel",
                "Entry_author.Family_name": "Cornilescu",
                "Entry_author.First_initial": ".",
                "Entry_author.Middle_initials": ".",
                "Entry_author.Family_title": ".",
                "Entry_author.Entry_ID": "15000"
            },
            {
                "Entry_author.Ordinal": "3",
                "Entry_author.Given_name": "Erik",
                "Entry_author.Family_name": "Hadley",
                "Entry_author.First_initial": ".",
                "Entry_author.Middle_initials": "B.",
                "Entry_author.Family_title": ".",
                "Entry_author.Entry_ID": "15000"
            },
            {
                "Entry_author.Ordinal": "4",
                "Entry_author.Given_name": "Samuel",
                "Entry_author.Family_name": "Gellman",
                "Entry_author.First_initial": ".",
                "Entry_author.Middle_initials": "H.",
                "Entry_author.Family_title": ".",
                "Entry_author.Entry_ID": "15000"
            },
            {
                "Entry_author.Ordinal": "5",
                "Entry_author.Given_name": "John",
                "Entry_author.Family_name": "Markley",
                "Entry_author.First_initial": ".",
                "Entry_author.Middle_initials": "L.",
                "Entry_author.Family_title": ".",
                "Entry_author.Entry_ID": "15000"
            }
        ]
    ],
    "loop_1": [
        [
            "Datum.Type",
            "Datum.Count",
            "Datum.Entry_ID"
        ],
        [
            {
                "Datum.Type": "13C chemical shifts",
                "Datum.Count": "77",
                "Datum.Entry_ID": "15000"
            },
            {
                "Datum.Type": "15N chemical shifts",
                "Datum.Count": "40",
                "Datum.Entry_ID": "15000"
            },
            {
                "Datum.Type": "1H chemical shifts",
                "Datum.Count": "223",
                "Datum.Entry_ID": "15000"
            }
        ]
    ]
}
In [24]:
starfile.print_loop("save_entry_information", "loop_0", file_format="nmrstar")
   _Entry_author.Ordinal
   _Entry_author.Given_name
   _Entry_author.Family_name
   _Entry_author.First_initial
   _Entry_author.Middle_initials
   _Entry_author.Family_title
   _Entry_author.Entry_ID

   1 Claudia Cornilescu . C. . 15000
   2 Gabriel Cornilescu . . . 15000
   3 Erik Hadley . B. . 15000
   4 Samuel Gellman . H. . 15000
   5 John Markley . L. . 15000
In [25]:
starfile.print_loop("save_entry_information", "loop_0", file_format="json")
[
    [
        "Entry_author.Ordinal",
        "Entry_author.Given_name",
        "Entry_author.Family_name",
        "Entry_author.First_initial",
        "Entry_author.Middle_initials",
        "Entry_author.Family_title",
        "Entry_author.Entry_ID"
    ],
    [
        {
            "Entry_author.Ordinal": "1",
            "Entry_author.Given_name": "Claudia",
            "Entry_author.Family_name": "Cornilescu",
            "Entry_author.First_initial": ".",
            "Entry_author.Middle_initials": "C.",
            "Entry_author.Family_title": ".",
            "Entry_author.Entry_ID": "15000"
        },
        {
            "Entry_author.Ordinal": "2",
            "Entry_author.Given_name": "Gabriel",
            "Entry_author.Family_name": "Cornilescu",
            "Entry_author.First_initial": ".",
            "Entry_author.Middle_initials": ".",
            "Entry_author.Family_title": ".",
            "Entry_author.Entry_ID": "15000"
        },
        {
            "Entry_author.Ordinal": "3",
            "Entry_author.Given_name": "Erik",
            "Entry_author.Family_name": "Hadley",
            "Entry_author.First_initial": ".",
            "Entry_author.Middle_initials": "B.",
            "Entry_author.Family_title": ".",
            "Entry_author.Entry_ID": "15000"
        },
        {
            "Entry_author.Ordinal": "4",
            "Entry_author.Given_name": "Samuel",
            "Entry_author.Family_name": "Gellman",
            "Entry_author.First_initial": ".",
            "Entry_author.Middle_initials": "H.",
            "Entry_author.Family_title": ".",
            "Entry_author.Entry_ID": "15000"
        },
        {
            "Entry_author.Ordinal": "5",
            "Entry_author.Given_name": "John",
            "Entry_author.Family_name": "Markley",
            "Entry_author.First_initial": ".",
            "Entry_author.Middle_initials": "L.",
            "Entry_author.Family_title": ".",
            "Entry_author.Entry_ID": "15000"
        }
    ]
]
  • Accessing chemical shift data:

    Chemical shift data can be accessed with the bracket accessors described above, using a saveframe name and a loop name:

In [26]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][0]
Out[26]:
['Atom_chem_shift.ID',
 'Atom_chem_shift.Assembly_atom_ID',
 'Atom_chem_shift.Entity_assembly_ID',
 'Atom_chem_shift.Entity_ID',
 'Atom_chem_shift.Comp_index_ID',
 'Atom_chem_shift.Seq_ID',
 'Atom_chem_shift.Comp_ID',
 'Atom_chem_shift.Atom_ID',
 'Atom_chem_shift.Atom_type',
 'Atom_chem_shift.Atom_isotope_number',
 'Atom_chem_shift.Val',
 'Atom_chem_shift.Val_err',
 'Atom_chem_shift.Assign_fig_of_merit',
 'Atom_chem_shift.Ambiguity_code',
 'Atom_chem_shift.Occupancy',
 'Atom_chem_shift.Resonance_ID',
 'Atom_chem_shift.Auth_entity_assembly_ID',
 'Atom_chem_shift.Auth_asym_ID',
 'Atom_chem_shift.Auth_seq_ID',
 'Atom_chem_shift.Auth_comp_ID',
 'Atom_chem_shift.Auth_atom_ID',
 'Atom_chem_shift.Details',
 'Atom_chem_shift.Entry_ID',
 'Atom_chem_shift.Assigned_chem_shift_list_ID']
In [27]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][0]["Atom_chem_shift.Seq_ID"]
Out[27]:
'2'
In [28]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][0]["Atom_chem_shift.Comp_ID"]
Out[28]:
'SER'
In [29]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][0]["Atom_chem_shift.Atom_ID"]
Out[29]:
'H'
In [30]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][0]["Atom_chem_shift.Val"]
Out[30]:
'9.3070'
In [31]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][1]["Atom_chem_shift.Atom_ID"]
Out[31]:
'HA'
In [32]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][1]["Atom_chem_shift.Val"]
Out[32]:
'4.5970'
In [33]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][2]["Atom_chem_shift.Atom_ID"]
Out[33]:
'HB2'
In [34]:
starfile["save_assigned_chem_shift_list_1"]["loop_0"][1][2]["Atom_chem_shift.Val"]
Out[34]:
'4.3010'
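
Because every loop is stored as a two-element list (field names first, then one dictionary per row), the repeated bracket lookups above can be folded into a single pass over the rows. The following is a minimal sketch that uses a small hand-copied sample in place of a real StarFile object; the `sample_loop` variable and the `iter_shifts` helper are illustrative names and are not part of nmrstarlib:

```python
# Hand-copied subset of starfile["save_assigned_chem_shift_list_1"]["loop_0"];
# a real loop carries many more fields and rows.
sample_loop = [
    ["Atom_chem_shift.Seq_ID", "Atom_chem_shift.Comp_ID",
     "Atom_chem_shift.Atom_ID", "Atom_chem_shift.Val"],
    [
        {"Atom_chem_shift.Seq_ID": "2", "Atom_chem_shift.Comp_ID": "SER",
         "Atom_chem_shift.Atom_ID": "H", "Atom_chem_shift.Val": "9.3070"},
        {"Atom_chem_shift.Seq_ID": "2", "Atom_chem_shift.Comp_ID": "SER",
         "Atom_chem_shift.Atom_ID": "HA", "Atom_chem_shift.Val": "4.5970"},
        {"Atom_chem_shift.Seq_ID": "2", "Atom_chem_shift.Comp_ID": "SER",
         "Atom_chem_shift.Atom_ID": "HB2", "Atom_chem_shift.Val": "4.3010"},
    ],
]


def iter_shifts(loop):
    """Yield (seq_id, residue, atom, shift) tuples from a chemical shift loop."""
    _field_names, rows = loop  # element 0: field names, element 1: row dicts
    for row in rows:
        yield (row["Atom_chem_shift.Seq_ID"],
               row["Atom_chem_shift.Comp_ID"],
               row["Atom_chem_shift.Atom_ID"],
               float(row["Atom_chem_shift.Val"]))  # values are stored as strings


shifts = list(iter_shifts(sample_loop))
for seq_id, residue, atom, value in shifts:
    print(seq_id, residue, atom, value)
```

With a real file, the same helper could presumably be given starfile["save_assigned_chem_shift_list_1"]["loop_0"] directly, since it relies only on the layout shown in the outputs above.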

In addition, the StarFile class provides a chem_shifts_by_residue() method that organizes chemical shifts into a list of collections.OrderedDict data structures (keys are sequence ids, values are chemical shift data), one for each protein chain if multiple chains are present within the file:

In [35]:
# access all chemical shifts
starfile.chem_shifts_by_residue()
Out[35]:
[OrderedDict([('2',
               OrderedDict([('AA3Code', 'SER'),
                            ('Seq_ID', '2'),
                            ('H', '9.3070'),
                            ('HA', '4.5970'),
                            ('HB2', '4.3010'),
                            ('HB3', '4.0550'),
                            ('CB', '64.6000'),
                            ('N', '121.5800')])),
              ('3',
               OrderedDict([('AA3Code', 'ASP'),
                            ('Seq_ID', '3'),
                            ('H', '8.0740'),
                            ('HA', '4.5580'),
                            ('HB2', '2.835'),
                            ('HB3', '2.754'),
                            ('CA', '57.6400'),
                            ('N', '121.1040')])),
              ('4',
               OrderedDict([('AA3Code', 'GLU'),
                            ('Seq_ID', '4'),
                            ('H', '8.6520'),
                            ('HA', '4.1420'),
                            ('HB2', '2.0520'),
                            ('HB3', '2.0320'),
                            ('HG2', '2.4540'),
                            ('CB', '28.1200'),
                            ('CG', '33.2720'),
                            ('N', '119.8900')]))])]
In [36]:
# access chemical shifts for "SER" and "GLU" amino acids
starfile.chem_shifts_by_residue(amino_acids=["SER", "GLU"])
Out[36]:
[OrderedDict([('2',
               OrderedDict([('AA3Code', 'SER'),
                            ('Seq_ID', '2'),
                            ('H', '9.3070'),
                            ('HA', '4.5970'),
                            ('HB2', '4.3010'),
                            ('HB3', '4.0550'),
                            ('CB', '64.6000'),
                            ('N', '121.5800')])),
              ('4',
               OrderedDict([('AA3Code', 'GLU'),
                            ('Seq_ID', '4'),
                            ('H', '8.6520'),
                            ('HA', '4.1420'),
                            ('HB2', '2.0520'),
                            ('HB3', '2.0320'),
                            ('HG2', '2.4540'),
                            ('CB', '28.1200'),
                            ('CG', '33.2720'),
                            ('N', '119.8900')]))])]
In [37]:
# access chemical shifts for "SER" and "GLU" amino acids for "CB" and "CG" atoms
starfile.chem_shifts_by_residue(amino_acids=["SER", "GLU"], atoms=["CB", "CG"])
Out[37]:
[OrderedDict([('2',
               OrderedDict([('AA3Code', 'SER'),
                            ('Seq_ID', '2'),
                            ('CB', '64.6000')])),
              ('4',
               OrderedDict([('AA3Code', 'GLU'),
                            ('Seq_ID', '4'),
                            ('CB', '28.1200'),
                            ('CG', '33.2720')]))])]
In [38]:
# access chemical shifts for specific amino acids and specific atoms
starfile.chem_shifts_by_residue(amino_acids_and_atoms={"SER":["HA", "HB2", "HB3"], "ASP": ["CA", "N"]})
Out[38]:
[OrderedDict([('2',
               OrderedDict([('AA3Code', 'SER'),
                            ('Seq_ID', '2'),
                            ('HA', '4.5970'),
                            ('HB2', '4.3010'),
                            ('HB3', '4.0550')])),
              ('3',
               OrderedDict([('AA3Code', 'ASP'),
                            ('Seq_ID', '3'),
                            ('CA', '57.6400'),
                            ('N', '121.1040')]))])]
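
The nested OrderedDict structure returned by chem_shifts_by_residue() mixes two bookkeeping keys (AA3Code and Seq_ID) with the atom names, and all values are strings. As a sketch of how such a result might be post-processed, the snippet below rebuilds a small part of the output shown above by hand and flattens it into numeric records; `chains`, `METADATA_KEYS` and `flatten_chain` are hypothetical names used only for this illustration:

```python
from collections import OrderedDict

# Hand-built copy of the Out[37] result above (one chain, two residues).
chains = [OrderedDict([
    ("2", OrderedDict([("AA3Code", "SER"), ("Seq_ID", "2"),
                       ("CB", "64.6000")])),
    ("4", OrderedDict([("AA3Code", "GLU"), ("Seq_ID", "4"),
                       ("CB", "28.1200"), ("CG", "33.2720")])),
])]

# Keys that describe the residue itself rather than an atom.
METADATA_KEYS = {"AA3Code", "Seq_ID"}


def flatten_chain(chain):
    """Return (seq_id, residue, atom, shift) records for one chain."""
    records = []
    for seq_id, residue in chain.items():
        for key, value in residue.items():
            if key in METADATA_KEYS:
                continue  # skip bookkeeping entries
            records.append((int(seq_id), residue["AA3Code"], key, float(value)))
    return records


records = flatten_chain(chains[0])
for record in records:
    print(record)
```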

Writing data from a StarFile object into a file

Data from a StarFile can be written into a file in the original NMR-STAR format or in the equivalent JSON format using write():

  • Writing into an NMR-STAR formatted file:
In [39]:
with open("out/bmr15000_modified.str", "w") as outfile:
    starfile.write(outfile, file_format="nmrstar")
  • Writing into a JSONized NMR-STAR formatted file:
In [40]:
with open("out/bmr15000_modified.json", "w") as outfile:
    starfile.write(outfile, file_format="json")
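
Both examples write into an out/ directory. Python's open() does not create missing directories, so the directory has to exist before write() is called. A minimal sketch of preparing the output path with the standard library follows. It uses a temporary directory and writes a placeholder string so that it runs without nmrstarlib; in real use the placeholder write would be replaced by starfile.write(outfile, file_format="json"):

```python
import os
import tempfile

# A temporary base directory keeps this sketch from touching the working tree.
base_dir = tempfile.mkdtemp()
out_dir = os.path.join(base_dir, "out")

# exist_ok=True makes the call safe to repeat if the directory already exists.
os.makedirs(out_dir, exist_ok=True)

out_path = os.path.join(out_dir, "bmr15000_modified.json")
with open(out_path, "w") as outfile:
    outfile.write("{}")  # placeholder standing in for starfile.write(...)

print(os.path.isfile(out_path))
```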

Converting NMR-STAR files

NMR-STAR files can be converted between the NMR-STAR file format and a JSONized NMR-STAR file format using the nmrstarlib.converter and nmrstarlib.translator modules.

One-to-one file conversions

  • Converting from the NMR-STAR file format into its equivalent JSON file format:
In [41]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToStarFile

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToStarFile(from_path="18569", to_path="out/bmr18569.json",
                                         from_format="nmrstar", to_format="json"))
converter.convert()
  • Converting from the JSON file format into its equivalent NMR-STAR file format:
In [42]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToStarFile

# Using the "out/bmr18569.json" file generated above
converter = Converter(StarFileToStarFile(from_path="out/bmr18569.json", to_path="out/bmr18569.str",
                                         from_format="json", to_format="nmrstar"))
converter.convert()

Many-to-many file conversions

  • Converting from a directory of NMR-STAR formatted files into equivalent JSON formatted files:
In [43]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToStarFile

converter = Converter(StarFileToStarFile(from_path="starfiles_dir_nmrstar", to_path="out/starfiles_dir_json",
                                         from_format="nmrstar", to_format="json"))
converter.convert()
  • Converting from a directory of JSONized NMR-STAR formatted files into NMR-STAR formatted files:
In [44]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToStarFile

converter = Converter(StarFileToStarFile(from_path="starfiles_dir_json", to_path="out/starfiles_dir_nmrstar",
                                         from_format="json", to_format="nmrstar"))
converter.convert()

Note

Both many-to-many and one-to-one file conversions are available. See nmrstarlib.converter for the full list of available conversions.

Creating simulated peak lists from NMR-STAR formatted files

Creating simulated peak lists without variance

Chemical shift values and assignment information deposited in NMR-STAR formatted files can be used to generate a large number of simulated peak lists for different types of solution and solid-state NMR experiments. Many different types of standard NMR experiments are defined in the spectrum_description.json configuration file. We will be using the HNcoCACB spectrum type for the following examples.

  • Creating a zero-variance HNcoCACB peak list file in sparky-like format from an NMR-STAR formatted file:
In [45]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_HNcoCACB.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="HNcoCACB"))
converter.convert()

The generated 18569_HNcoCACB.txt peak list file should look like the following:

Assignment           w1              w2              w3

SER2H-SER2N-MET1CA           8.225           117.197         55.489
SER2H-SER2N-MET1CB           8.225           117.197         32.848
GLU3H-GLU3N-SER2CA           8.002           119.833         58.593
GLU3H-GLU3N-SER2CB           8.002           119.833         64.057
THR4H-THR4N-GLU3CA           8.956           117.212         55.651
THR4H-THR4N-GLU3CB           8.956           117.212         32.952
...
  • Creating a zero-variance HNcoCACB peak list file in JSON format from an NMR-STAR formatted file:
In [46]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_HNcoCACB.json",
                                         from_format="nmrstar", to_format="json",
                                         spectrum_name="HNcoCACB"))
converter.convert()

The generated 18569_HNcoCACB.json peak list file should look like the following:

[
 {"Assignment": ["SER2H", "SER2N", "MET1CA"], "Dimensions": [8.225, 117.197, 55.489]},
 {"Assignment": ["SER2H", "SER2N", "MET1CB"], "Dimensions": [8.225, 117.197, 32.848]},
 {"Assignment": ["GLU3H", "GLU3N", "SER2CA"], "Dimensions": [8.002, 119.833, 58.593]},
 {"Assignment": ["GLU3H", "GLU3N", "SER2CB"], "Dimensions": [8.002, 119.833, 64.057]},
 {"Assignment": ["THR4H", "THR4N", "GLU3CA"], "Dimensions": [8.956, 117.212, 55.651]},
 {"Assignment": ["THR4H", "THR4N", "GLU3CB"], "Dimensions": [8.956, 117.212, 32.952]},
 ...
]
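
The JSON peak list is a list of objects with Assignment and Dimensions keys, so it can be read back with the standard json module. The sketch below parses the first four peaks shown above (copied into a string so that the example is self-contained) and groups the carbon shifts by their H/N root. The grouping logic and the `spin_systems` name are an illustration only, not something nmrstarlib provides:

```python
import json
from collections import defaultdict

# First four peaks of the generated file, copied from the listing above.
peak_list_text = """
[
 {"Assignment": ["SER2H", "SER2N", "MET1CA"], "Dimensions": [8.225, 117.197, 55.489]},
 {"Assignment": ["SER2H", "SER2N", "MET1CB"], "Dimensions": [8.225, 117.197, 32.848]},
 {"Assignment": ["GLU3H", "GLU3N", "SER2CA"], "Dimensions": [8.002, 119.833, 58.593]},
 {"Assignment": ["GLU3H", "GLU3N", "SER2CB"], "Dimensions": [8.002, 119.833, 64.057]}
]
"""

peaks = json.loads(peak_list_text)

# Peaks that share the same H and N assignment belong to one spin system.
spin_systems = defaultdict(list)
for peak in peaks:
    root = tuple(peak["Assignment"][:2])              # e.g. ("SER2H", "SER2N")
    spin_systems[root].append(peak["Dimensions"][2])  # third dimension: carbon

for root, carbon_shifts in spin_systems.items():
    print(root, carbon_shifts)
```

To process a real file, json.loads(peak_list_text) would be replaced by json.load() on the opened out/18569_HNcoCACB.json file.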

Creating simulated peak lists with variance drawn from random normal distribution

  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to peak dimensions from a single source of variance, i.e. 100% of peaks will have chemical shift values adjusted using noise values from the defined random normal distribution:
In [47]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList
from nmrstarlib.noise import NoiseGenerator

# create parameters dictionary for random normal distribution
parameters = {"H_loc": [0], "C_loc": [0], "N_loc": [0],
              "H_scale": [0.001], "C_scale": [0.01], "N_scale": [0.01]}

# create random normal noise generator
random_normal_noise_generator = NoiseGenerator(parameters)

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_HNcoCACB_ssv_HCN.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="HNcoCACB",
                                         noise_generator=random_normal_noise_generator))
converter.convert()

The generated 18569_HNcoCACB_ssv_HCN.txt peak list file should look like the following:

Assignment           w1              w2              w3

SER2H-SER2N-MET1CA           8.226026                117.193655              55.477204
SER2H-SER2N-MET1CB           8.224649                117.184255              32.845212
GLU3H-GLU3N-SER2CA           8.003282                119.841221              58.603253
GLU3H-GLU3N-SER2CB           8.002372                119.827019              64.067278
THR4H-THR4N-GLU3CA           8.955568                117.215237              55.663902
THR4H-THR4N-GLU3CB           8.955757                117.206167              32.96412
  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to H and N peak dimensions but not C peak dimension from a single source of variance, i.e. 100% of peaks will have chemical shift values adjusted using noise values from the defined random normal distribution:
In [48]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList
from nmrstarlib.noise import NoiseGenerator

# create parameters dictionary for random normal distribution
parameters = {"H_loc": [0], "C_loc": [None], "N_loc": [0],
              "H_scale": [0.001], "C_scale": [None], "N_scale": [0.01]}

# create random normal noise generator
random_normal_noise_generator = NoiseGenerator(parameters)

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_HNcoCACB_ssv_HN.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="HNcoCACB",
                                         noise_generator=random_normal_noise_generator))
converter.convert()

The generated 18569_HNcoCACB_ssv_HN.txt peak list file should look like the following (note the differences in chemical shift values in the H and N dimensions for peaks that belong to the same spin system):

Assignment           w1              w2              w3

SER2H-SER2N-MET1CA           8.226085                117.191527              55.489
SER2H-SER2N-MET1CB           8.224509                117.204666              32.848
GLU3H-GLU3N-SER2CA           8.001657                119.846806              58.593
GLU3H-GLU3N-SER2CB           8.003165                119.8268                64.057
THR4H-THR4N-GLU3CA           8.956946                117.209486              55.651
THR4H-THR4N-GLU3CB           8.955755                117.209889              32.952
  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to peak dimensions from two sources of variance, i.e. chemical shift values will be adjusted using noise values from two random normal distributions. To specify two sources of variance, we need to state how the peak list should be split and provide distribution parameters for both distributions. Let’s say we want 70% of peaks to have a smaller variance in the H and N dimensions and 30% of peaks to have a larger variance in the H and N dimensions:
In [49]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList
from nmrstarlib.noise import NoiseGenerator

# create parameters dictionary for random normal distribution
parameters = {"H_loc": [0, 0], "C_loc": [None, None], "N_loc": [0, 0],
              "H_scale": [0.001, 0.005], "C_scale": [None, None], "N_scale": [0.01, 0.05]}

# create random normal noise generator
random_normal_noise_generator = NoiseGenerator(parameters)

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_HNcoCACB_tsv_HN.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="HNcoCACB",
                                         plsplit=(70,30),
                                         noise_generator=random_normal_noise_generator))
converter.convert()

The generated 18569_HNcoCACB_tsv_HN.txt peak list file should look like the following (note the larger variance in the last four peaks, especially in the N dimension):

Assignment           w1              w2              w3

SER2H-SER2N-MET1CA           8.223356                117.208041              55.489
SER2H-SER2N-MET1CB           8.22532                 117.184278              32.848
GLU3H-GLU3N-SER2CA           8.00271                 119.847153              58.593
GLU3H-GLU3N-SER2CB           8.002822                119.824752              64.057
...
GLU114H-GLU114N-LEU113CA             7.614195                118.672897              56.14
GLU114H-GLU114N-LEU113CB             7.628722                118.565859              43.249
GLY115H-GLY115N-GLU114CA             7.583248                113.45153               57.005
GLY115H-GLY115N-GLU114CB             7.596634                113.472049              30.079
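
To make the role of the parameters dictionary and plsplit more concrete, the sketch below imitates the idea with the standard library only. It is not nmrstarlib's implementation: the `add_noise` helper, the assumption that the split is applied to consecutive blocks of peaks, and the use of random.gauss are all choices made for this illustration. Each list position in the parameters dictionary corresponds to one source of variance, and None is treated as "leave this dimension unchanged":

```python
import random

# Same dictionary as in the two-source example above.
parameters = {"H_loc": [0, 0], "C_loc": [None, None], "N_loc": [0, 0],
              "H_scale": [0.001, 0.005], "C_scale": [None, None], "N_scale": [0.01, 0.05]}
plsplit = (70, 30)                 # percentages of peaks per source of variance
dimension_nuclei = ["H", "N", "C"]  # nucleus of each HNcoCACB dimension

# Ten copies of one zero-variance peak, purely for illustration.
peaks = [[8.225, 117.197, 55.489] for _ in range(10)]

rng = random.Random(0)  # seeded so the sketch is reproducible


def add_noise(peak, source_index):
    """Return a copy of peak with normal noise from the chosen source added."""
    noisy = []
    for nucleus, value in zip(dimension_nuclei, peak):
        loc = parameters[nucleus + "_loc"][source_index]
        scale = parameters[nucleus + "_scale"][source_index]
        if loc is None or scale is None:
            noisy.append(value)  # None: this dimension receives no noise
        else:
            noisy.append(value + rng.gauss(loc, scale))
    return noisy


# Assumption: the first 70% of peaks use source 0, the remaining 30% source 1.
first_block = round(len(peaks) * plsplit[0] / 100)
noisy_peaks = [add_noise(peak, 0 if index < first_block else 1)
               for index, peak in enumerate(peaks)]

for peak in noisy_peaks:
    print(peak)
```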

Creating simulated peak lists with variance drawn from other distribution types

  • It is also possible to generate simulated peak lists using other types of statistical distribution functions. For example, let’s simulate the peak list using noise values drawn from a chisquare distribution with 5 degrees of freedom for the H and N dimensions from a single source of variance.
In [50]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList
from nmrstarlib.noise import NoiseGenerator

# create parameters dictionary for distribution
parameters = {"H_df": [5], "C_df": [None], "N_df": [5]}

# create chisquare noise generator
chisquare_noise_generator = NoiseGenerator(parameters, distribution_name="chisquare")

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_HNcoCACB_ssv_HN_chi2.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="HNcoCACB",
                                         noise_generator=chisquare_noise_generator))
converter.convert()

The generated 18569_HNcoCACB_ssv_HN_chi2.txt peak list file should look like the following:

Assignment           w1              w2              w3

SER2H-SER2N-MET1CA           12.50083                127.197738              55.489
SER2H-SER2N-MET1CB           10.495158               121.039655              32.848
GLU3H-GLU3N-SER2CA           15.597162               124.603078              58.593
GLU3H-GLU3N-SER2CB           8.340404                126.784481              64.057
THR4H-THR4N-GLU3CA           10.010804               120.476893              55.651
THR4H-THR4N-GLU3CB           11.961498               121.681636              32.952
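
Note that the shifts in the H and N dimensions are several ppm larger than the zero-variance values. This is expected: a chi-square variable is never negative and its mean equals its degrees of freedom, so df = 5 adds about 5 ppm on average. The sketch below checks this with the standard library, using the fact that a chi-square distribution with k degrees of freedom is a gamma distribution with shape k/2 and scale 2. It is independent of how nmrstarlib draws its samples:

```python
import random

rng = random.Random(42)  # seeded so the sketch is reproducible
df = 5                   # degrees of freedom, as in the example above

# chi-square(df) is the same distribution as gamma(shape=df/2, scale=2).
samples = [rng.gammavariate(df / 2, 2) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)

print("smallest sample:", min(samples))  # always positive
print("sample mean:", sample_mean)       # close to df
```

For noise centred on the original chemical shifts, a symmetric distribution such as normal with loc set to 0 is the more natural choice.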
  • Below is the list of all supported distribution functions, along with their parameters, when the numpy library is not installed:
[
    {"function": "uniform", "parameters": ["low", "high"]},
    {"function": "triangular", "parameters": ["left", "right", "mode"]},
    {"function": "beta", "parameters": ["a", "b"]},
    {"function": "exponential", "parameters": ["scale"]},
    {"function": "gamma", "parameters": ["shape", "scale"]},
    {"function": "gauss", "parameters": ["mu", "sigma"]},
    {"function": "normal", "parameters": ["loc", "scale"]},
    {"function": "lognormal", "parameters": ["mean", "sigma"]},
    {"function": "vonmises", "parameters": ["mu", "kappa"]},
    {"function": "pareto", "parameters": ["a"]}
]
  • And the list of all supported distribution functions, along with their parameters, when the numpy library is installed:
[
    {"function": "beta", "parameters": ["a", "b"]},
    {"function": "binomial", "parameters": ["n", "p"]},
    {"function": "chisquare", "parameters": ["df"]},
    {"function": "exponential", "parameters": ["scale"]},
    {"function": "f", "parameters": ["dfnum", "dfden"]},
    {"function": "gamma", "parameters": ["shape", "scale"]},
    {"function": "geometric", "parameters": ["p"]},
    {"function": "gumbel", "parameters": ["loc", "scale"]},
    {"function": "hypergeometric", "parameters": ["ngood", "nbad", "nsample"]},
    {"function": "laplace", "parameters": ["loc", "scale"]},
    {"function": "logistic", "parameters": ["loc", "scale"]},
    {"function": "lognormal", "parameters": ["mean", "sigma"]},
    {"function": "logseries", "parameters": ["p"]},
    {"function": "negative_binomial", "parameters": ["n", "p"]},
    {"function": "noncentral_chisquare", "parameters": ["df", "nonc"]},
    {"function": "noncentral_f", "parameters": ["dfnum", "dfden", "nonc"]},
    {"function": "normal", "parameters": ["loc", "scale"]},
    {"function": "pareto", "parameters": ["a"]},
    {"function": "poisson", "parameters": ["lam"]},
    {"function": "power", "parameters": ["a"]},
    {"function": "rayleigh", "parameters": ["scale"]},
    {"function": "triangular", "parameters": ["left", "mode", "right"]},
    {"function": "uniform", "parameters": ["low", "high"]},
    {"function": "vonmises", "parameters": ["mu", "kappa"]},
    {"function": "wald", "parameters": ["mean", "scale"]},
    {"function": "weibull", "parameters": ["a"]},
    {"function": "zipf", "parameters": ["a"]}
]
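
In the examples above, the keys of the parameters dictionary combine a nucleus letter with a parameter name from these lists (H_loc, C_scale, N_df). Assuming this naming pattern holds for the other distributions as well, a small helper can assemble the dictionary and fill unspecified nuclei with None. The `build_parameters` function below is hypothetical and not part of nmrstarlib:

```python
def build_parameters(parameter_names, values_by_nucleus,
                     nuclei=("H", "C", "N"), sources=1):
    """Assemble a "<nucleus>_<parameter>" dictionary for a noise generator.

    parameter_names   -- parameter names of the distribution, e.g. ["loc", "scale"]
    values_by_nucleus -- mapping such as {"H": {"loc": [0], "scale": [0.001]}}
    sources           -- number of sources of variance (length of each list)
    """
    parameters = {}
    for nucleus in nuclei:
        given = values_by_nucleus.get(nucleus, {})
        for name in parameter_names:
            # Nuclei or parameters that are not specified receive None per source.
            key = "{}_{}".format(nucleus, name)
            parameters[key] = given.get(name, [None] * sources)
    return parameters


# Reproduces the dictionary from the "H and N but not C" example above.
parameters = build_parameters(["loc", "scale"],
                              {"H": {"loc": [0], "scale": [0.001]},
                               "N": {"loc": [0], "scale": [0.01]}})
print(parameters)
```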

Spectrum description configuration file

The spectrum description configuration file (spectrum_description.json) contains descriptions of standard solution and solid-state NMR experiments.

  • List all available experiments:
In [51]:
nmrstarlib.nmrstarlib.list_spectrums()
CANCO
CANCOCX
CBCANH
CBCAcoNH
CCcoNH
HBHAcoNH
HNCA
HNCACB
HNCO
HNcaCO
HNcoCA
HNcoCACB
HSQC
HccoNH
NCA
NCACX
NCO
NCOCX
  • List all available spectrum descriptions:
In [52]:
nmrstarlib.nmrstarlib.list_spectrum_descriptions()
{'CANCO': {'Labels': ['CA', 'N', 'CO-1'],
           'MinNumberPeaksPerSpinSystem': 1,
           'PeakDescriptions': [{'dimensions': ['CA', 'N', 'CO-1'], 'fraction': 1}]},
 'CANCOCX': {'Labels': ['CA', 'N', 'CO-1', 'CX-1'],
             'MinNumberPeaksPerSpinSystem': 2,
             'PeakDescriptions': [{'dimensions': ['CA', 'N', 'CO-1', 'CO-1'], 'fraction': 1},
                                  {'dimensions': ['CA', 'N', 'CO-1', 'CA-1'], 'fraction': 1},
                                  {'dimensions': ['CA', 'N', 'CO-1', 'CB-1'], 'fraction': 1},
                                  {'dimensions': ['CA', 'N', 'CO-1', 'CG-1'], 'fraction': 1},
                                  {'dimensions': ['CA', 'N', 'CO-1', 'CD-1'], 'fraction': 1},
                                  {'dimensions': ['CA', 'N', 'CO-1', 'CE-1'], 'fraction': 1},
                                  {'dimensions': ['CA', 'N', 'CO-1', 'CZ-1'], 'fraction': 1}]},
 'CBCANH': {'Labels': ['CA/CB', 'H', 'N'],
            'MinNumberPeaksPerSpinSystem': 2,
            'PeakDescriptions': [{'dimensions': ['CA', 'H', 'N'], 'fraction': 1},
                                 {'dimensions': ['CB', 'H', 'N'], 'fraction': 0.95},
                                 {'dimensions': ['CA', 'H+1', 'N+1'], 'fraction': 1},
                                 {'dimensions': ['CB', 'H+1', 'N+1'], 'fraction': 0.95}]},
 'CBCAcoNH': {'Labels': ['CA/CB', 'H+1', 'N+1'],
              'MinNumberPeaksPerSpinSystem': 2,
              'PeakDescriptions': [{'dimensions': ['CA', 'H+1', 'N+1'], 'fraction': 1},
                                   {'dimensions': ['CB', 'H+1', 'N+1'], 'fraction': 0.95}]},
 'CCcoNH': {'Labels': ['CX-1', 'N', 'H'],
            'MinNumberPeaksPerSpinSystem': 2,
            'PeakDescriptions': [{'dimensions': ['CA-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['CB-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['CG-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['CD-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['CE-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['CZ-1', 'N', 'H'], 'fraction': 1}],
            'ResonanceLimit': {'ALA': ['H', 'N', 'CA', 'CB'],
                               'ARG': ['H', 'N', 'CA', 'CB', 'CG', 'CD', 'CZ'],
                               'ASN': ['H', 'N', 'CA', 'CB', 'CG'],
                               'ASP': ['H', 'N', 'CA', 'CB', 'CG'],
                               'CYS': ['H', 'N', 'CA', 'CB'],
                               'GLN': ['H', 'N', 'CA', 'CB', 'CG', 'CD'],
                               'GLU': ['H', 'N', 'CA', 'CB', 'CG', 'CD'],
                               'GLY': ['H', 'N', 'CA'],
                               'HIS': ['H', 'N', 'CA', 'CB'],
                               'ILE': ['H', 'N', 'CA', 'CB', 'CG1', 'CG2', 'CD1'],
                               'LEU': ['H', 'N', 'CA', 'CB', 'CG', 'CD1', 'CD2'],
                               'LYS': ['H', 'N', 'CA', 'CB', 'CG', 'CD', 'CE'],
                               'MET': ['H', 'N', 'CA', 'CB', 'CG', 'CE'],
                               'PHE': ['H', 'N', 'CA', 'CB'],
                               'SER': ['H', 'N', 'CA', 'CB'],
                               'THR': ['H', 'N', 'CA', 'CB', 'CG2'],
                               'TRP': ['H', 'N', 'CA', 'CB'],
                               'TYR': ['H', 'N', 'CA', 'CB'],
                               'VAL': ['H', 'N', 'CA', 'CB', 'CG1', 'CG2']}},
 'HBHAcoNH': {'Labels': ['HA/HB-1', 'N', 'H'],
              'MinNumberPeaksPerSpinSystem': 2,
              'PeakDescriptions': [{'dimensions': ['HA-1', 'N', 'H'], 'fraction': 1},
                                   {'dimensions': ['HB-1', 'N', 'H'], 'fraction': 1}]},
 'HNCA': {'Labels': ['H', 'N', 'CA'],
          'MinNumberPeaksPerSpinSystem': 1,
          'PeakDescriptions': [{'dimensions': ['H', 'N', 'CA'], 'fraction': 1},
                               {'dimensions': ['H', 'N', 'CA-1'], 'fraction': 1}]},
 'HNCACB': {'Labels': ['H', 'N', 'CA/CB'],
            'MinNumberPeaksPerSpinSystem': 2,
            'PeakDescriptions': [{'dimensions': ['H', 'N', 'CA'], 'fraction': 1},
                                 {'dimensions': ['H', 'N', 'CB'], 'fraction': 0.95},
                                 {'dimensions': ['H', 'N', 'CA-1'], 'fraction': 1},
                                 {'dimensions': ['H', 'N', 'CB-1'], 'fraction': 0.95}]},
 'HNCO': {'Labels': ['H', 'N', 'CO-1'],
          'MinNumberPeaksPerSpinSystem': 1,
          'PeakDescriptions': [{'dimensions': ['H', 'N', 'CO-1'], 'fraction': 1}]},
 'HNcaCO': {'Labels': ['H', 'N', 'CO'],
            'MinNumberPeaksPerSpinSystem': 1,
            'PeakDescriptions': [{'dimensions': ['H', 'N', 'CO'], 'fraction': 1},
                                 {'dimensions': ['H', 'N', 'CO-1'], 'fraction': 1}]},
 'HNcoCA': {'Labels': ['H', 'N', 'CA'],
            'MinNumberPeaksPerSpinSystem': 1,
            'PeakDescriptions': [{'dimensions': ['H', 'N', 'CA-1'], 'fraction': 1}]},
 'HNcoCACB': {'Labels': ['H', 'N', 'CA/CB-1'],
              'MinNumberPeaksPerSpinSystem': 2,
              'PeakDescriptions': [{'dimensions': ['H', 'N', 'CA-1'], 'fraction': 1},
                                   {'dimensions': ['H', 'N', 'CB-1'], 'fraction': 0.95}]},
 'HSQC': {'Labels': ['H', 'N'],
          'MinNumberPeaksPerSpinSystem': 1,
          'PeakDescriptions': [{'dimensions': ['H', 'N'], 'fraction': 1}]},
 'HccoNH': {'Labels': ['HX-1', 'N', 'H'],
            'MinNumberPeaksPerSpinSystem': 2,
            'PeakDescriptions': [{'dimensions': ['HA-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['HB-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['HG-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['HD-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['HE-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['HH-1', 'N', 'H'], 'fraction': 1},
                                 {'dimensions': ['HZ-1', 'N', 'H'], 'fraction': 1}],
            'ResonanceLimit': {'ALA': ['H', 'N', 'HA', 'HB'],
                               'ARG': ['H', 'N', 'HA', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3'],
                               'ASN': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'ASP': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'CYS': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'GLN': ['H', 'N', 'HA', 'HB2', 'HB3', 'HG2', 'HG3'],
                               'GLU': ['H', 'N', 'HA', 'HB2', 'HB3', 'HG2', 'HG3'],
                               'GLY': ['H', 'N', 'HA2', 'HA3'],
                               'HIS': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'ILE': ['H', 'N', 'HA', 'HB', 'HG12', 'HG13', 'HG2', 'HD1'],
                               'LEU': ['H', 'N', 'HA', 'HB2', 'HB3', 'HD1', 'HD2'],
                               'LYS': ['H', 'N', 'HA', 'HB2', 'HB3', 'HG2', 'HG3', 'HD2', 'HD3', 'HE2', 'HE3'],
                               'MET': ['H', 'N', 'HA', 'HB2', 'HB3', 'HG2', 'HG3', 'HE'],
                               'PHE': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'SER': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'THR': ['H', 'N', 'HA', 'HB', 'HG2'],
                               'TRP': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'TYR': ['H', 'N', 'HA', 'HB2', 'HB3'],
                               'VAL': ['H', 'N', 'HA', 'HB', 'HG1', 'HG2']}},
 'NCA': {'Labels': ['N', 'CA'],
         'MinNumberPeaksPerSpinSystem': 1,
         'PeakDescriptions': [{'dimensions': ['N', 'CA'], 'fraction': 1}]},
 'NCACX': {'Labels': ['N', 'CA', 'CX'],
           'MinNumberPeaksPerSpinSystem': 2,
           'PeakDescriptions': [{'dimensions': ['N', 'CA', 'CO'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CA'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CB'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CG'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CD'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CE'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CZ'], 'fraction': 1}]},
 'NCO': {'Labels': ['N', 'CO-1'],
         'MinNumberPeaksPerSpinSystem': 1,
         'PeakDescriptions': [{'dimensions': ['N', 'CO-1'], 'fraction': 1}]},
 'NCOCX': {'Labels': ['N', 'CO-1', 'CX-1'],
           'MinNumberPeaksPerSpinSystem': 2,
           'PeakDescriptions': [{'dimensions': ['N', 'CO-1', 'CA-1'], 'fraction': 1},
                                {'dimensions': ['N', 'CO-1', 'CB-1'], 'fraction': 1},
                                {'dimensions': ['N', 'CO-1', 'CG-1'], 'fraction': 1},
                                {'dimensions': ['N', 'CO-1', 'CD-1'], 'fraction': 1},
                                {'dimensions': ['N', 'CO-1', 'CE-1'], 'fraction': 1},
                                {'dimensions': ['N', 'CO-1', 'CZ-1'], 'fraction': 1}]}}
  • List specific spectrum descriptions:
In [53]:
nmrstarlib.nmrstarlib.list_spectrum_descriptions("HNcoCACB", "NCACX")
{'HNcoCACB': {'Labels': ['H', 'N', 'CA/CB-1'],
              'MinNumberPeaksPerSpinSystem': 2,
              'PeakDescriptions': [{'dimensions': ['H', 'N', 'CA-1'], 'fraction': 1},
                                   {'dimensions': ['H', 'N', 'CB-1'], 'fraction': 0.95}]}}
{'NCACX': {'Labels': ['N', 'CA', 'CX'],
           'MinNumberPeaksPerSpinSystem': 2,
           'PeakDescriptions': [{'dimensions': ['N', 'CA', 'CO'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CA'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CB'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CG'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CD'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CE'], 'fraction': 1},
                                {'dimensions': ['N', 'CA', 'CZ'], 'fraction': 1}]}}
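
Each spectrum description is an ordinary nested dict, so it can be inspected with plain Python. The following is a minimal sketch that works on a literal copy of the 'HNcoCACB' entry printed above; the summarize() helper is hypothetical and is not part of nmrstarlib.

```python
# Literal copy of the 'HNcoCACB' entry shown in the output above.
hncocacb = {
    "Labels": ["H", "N", "CA/CB-1"],
    "MinNumberPeaksPerSpinSystem": 2,
    "PeakDescriptions": [
        {"dimensions": ["H", "N", "CA-1"], "fraction": 1},
        {"dimensions": ["H", "N", "CB-1"], "fraction": 0.95},
    ],
}


def summarize(name, description):
    """Hypothetical helper: return human-readable lines for one description."""
    lines = [
        "{}: {} dimensions, at least {} peaks per spin system".format(
            name,
            len(description["Labels"]),
            description["MinNumberPeaksPerSpinSystem"],
        )
    ]
    # One line per peak template: its dimensions and its 'fraction' value.
    for peak in description["PeakDescriptions"]:
        lines.append(
            "  {} (fraction={})".format(" x ".join(peak["dimensions"]), peak["fraction"])
        )
    return lines


for line in summarize("HNcoCACB", hncocacb):
    print(line)
```

The same helper could be applied to every entry of the dict returned for several spectrum names.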
  • Add a custom experiment description and simulate a peak list based on it. A custom spectrum description can be added in two ways:

    1. Create an additional JSON configuration file with the spectrum description and update the SPECTRUM_DESCRIPTIONS dict.
    2. Define a dictionary with the new spectrum description and update the SPECTRUM_DESCRIPTIONS dict.
  1. Create an additional JSON configuration file with the spectrum description and update the SPECTRUM_DESCRIPTIONS dict. Content of custom_spectrum_description.json:
{
    "NCACX_custom": {
        "Labels": ["N", "CA", "CX"],
        "MinNumberPeaksPerSpinSystem": 2,
        "PeakDescriptions": [
            {"fraction": 1, "dimensions": ["N", "CA", "CO"]},
            {"fraction": 1, "dimensions": ["N", "CA", "CA"]},
            {"fraction": 1, "dimensions": ["N", "CA", "CB"]}
        ]
    }
}
In [54]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList
from nmrstarlib.noise import NoiseGenerator

# update SPECTRUM_DESCRIPTIONS
nmrstarlib.nmrstarlib.update_constants(spectrum_descriptions_cfg="path/to/custom_spectrum_description.json")

# create parameters dictionary for random normal distribution
parameters = {"H_loc": [None, None], "C_loc": [0, 0], "N_loc": [0, 0],
              "H_scale": [None, None], "C_scale": [0.01, 0.05], "N_scale": [0.01, 0.05]}

# create random normal noise generator
random_normal_noise_generator = NoiseGenerator(parameters)

converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_NCACX_custom.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="NCACX_custom",
                                         plsplit=(70,30),
                                         noise_generator=random_normal_noise_generator))
converter.convert()
  2. Define a dictionary with the new spectrum description and update the SPECTRUM_DESCRIPTIONS dict.
In [55]:
from nmrstarlib.converter import Converter
from nmrstarlib.translator import StarFileToPeakList
from nmrstarlib.noise import NoiseGenerator

custom_experiment_type = {
    "NCACX_custom": {
        "Labels": ["N", "CA", "CX"],
        "MinNumberPeaksPerSpinSystem": 2,
        "PeakDescriptions": [
            {"fraction": 1, "dimensions": ["N", "CA", "CO"]},
            {"fraction": 1, "dimensions": ["N", "CA", "CA"]},
            {"fraction": 1, "dimensions": ["N", "CA", "CB"]}
        ]
    }
}

# update SPECTRUM_DESCRIPTIONS
nmrstarlib.nmrstarlib.SPECTRUM_DESCRIPTIONS.update(custom_experiment_type)

# create parameters dictionary for random normal distribution
parameters = {"H_loc": [0, 0], "C_loc": [None, None], "N_loc": [0, 0],
              "H_scale": [0.001, 0.005], "C_scale": [None, None], "N_scale": [0.01, 0.05]}

# create random normal noise generator
random_normal_noise_generator = NoiseGenerator(parameters)

# Using valid BMRB id to access file from URL: from_path="18569"
converter = Converter(StarFileToPeakList(from_path="18569", to_path="out/18569_NCACX_custom.txt",
                                         from_format="nmrstar", to_format="sparky",
                                         spectrum_name="NCACX_custom",
                                         plsplit=(70,30),
                                         noise_generator=random_normal_noise_generator))
converter.convert()
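
Before registering a custom description, it can help to write it to disk and sanity-check its shape. The sketch below uses only the standard library. The check_spectrum_descriptions() helper and its rules are assumptions based on the pattern seen in the built-in descriptions (three top-level keys, and as many dimensions per peak as there are labels). They are not checks that nmrstarlib itself is known to perform.

```python
import json
import os
import tempfile

# Keys that every built-in description shown above contains.
REQUIRED_KEYS = {"Labels", "MinNumberPeaksPerSpinSystem", "PeakDescriptions"}


def check_spectrum_descriptions(descriptions):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    for name, entry in descriptions.items():
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append("{}: missing keys {}".format(name, sorted(missing)))
            continue
        n_dims = len(entry["Labels"])
        for i, peak in enumerate(entry["PeakDescriptions"]):
            if len(peak.get("dimensions", [])) != n_dims:
                problems.append(
                    "{}: peak {} does not have {} dimensions".format(name, i, n_dims)
                )
    return problems


custom = {
    "NCACX_custom": {
        "Labels": ["N", "CA", "CX"],
        "MinNumberPeaksPerSpinSystem": 2,
        "PeakDescriptions": [
            {"fraction": 1, "dimensions": ["N", "CA", "CO"]},
            {"fraction": 1, "dimensions": ["N", "CA", "CA"]},
            {"fraction": 1, "dimensions": ["N", "CA", "CB"]},
        ],
    }
}

# Write the description to a temporary JSON file and read it back.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "custom_spectrum_description.json")
    with open(path, "w") as outfile:
        json.dump(custom, outfile, indent=4)
    with open(path) as infile:
        reloaded = json.load(infile)

problems = check_spectrum_descriptions(reloaded)
print(problems or "description looks consistent")
```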

Visualizing chemical shift values

Chemical shift values can be visualized using the nmrstarlib.csviewer (Chemical Shifts Viewer) module.

  • Visualize all available chemical shifts for all amino acids.
In [56]:
from nmrstarlib.csviewer import CSViewer

csviewer = CSViewer(from_path="18569", filename="out/18569_chem_shifts_all", csview_format="png")
csviewer.csview(view=False)

nmrstarlib.csviewer output example:

_images/18569_chem_shifts_all.png
  • Visualize CA, CB, CG, and CG2 chemical shifts for specific amino acids.
In [57]:
from nmrstarlib.csviewer import CSViewer

csviewer = CSViewer(from_path="18569", amino_acids=["GLU", "THR"], atoms=["CA", "CB", "CG", "CG2"],
                    filename="out/18569_chem_shifts_GLU_THR_CA_CB_CG_CG2", csview_format="png")
csviewer.csview(view=False)

nmrstarlib.csviewer output example:

_images/18569_chem_shifts_GLU_THR_CA_CB_CG_CG2.png
  • Visualize specific atoms for specific amino acids.
In [58]:
from nmrstarlib.csviewer import CSViewer

csviewer = CSViewer(from_path="18569", amino_acids_and_atoms={"GLU": ["CA", "CB"], "THR": ["HA", "HB"]},
                    filename="out/18569_chem_shifts_GLU_CA_CB_THR_HA_HB", csview_format="png")
csviewer.csview(view=False)

nmrstarlib.csviewer output example:

_images/18569_chem_shifts_GLU_CA_CB_THR_HA_HB.png

Command Line Interface

Command Line Interface functionality:
  • Convert from the NMR-STAR file format into its equivalent JSON file format and vice versa.
  • Create simulated peak list files using chemical shift and assignment information.
  • Visualize assigned chemical shift values.
In [59]:
! python3 -m nmrstarlib --help
nmrstarlib command-line interface

Usage:
    nmrstarlib -h | --help
    nmrstarlib --version
    nmrstarlib convert (<from-path> <to-path>) [--from-format=<format>] [--to-format=<format>] [--bmrb-url=<url>] [--nmrstar-version=<version>] [--verbose]
    nmrstarlib csview <starfile-path> [--aa=<aa>] [--at=<at>] [--aa-at=<aa-at>] [--csview-outfile=<path>] [--csview-format=<format>] [--bmrb-url=<url>] [--nmrstar-version=<version>] [--verbose] [--show]
    nmrstarlib plsimulate (<from-path> <to-path> <spectrum>) [--from-format=<format>] [--to-format=<format>] [--plsplit=<%>] [--distribution=<func>] [--seed=<value>] [--H=<value>] [--C=<value>] [--N=<value>] [--bmrb-url=<url>] [--nmrstar-version=<version>] [--spectrum-descriptions=<path>] [--verbose]

Options:
    -h, --help                      Show this screen.
    --version                       Show version.
    --verbose                       Print what files are processing.
    --show                          Display chemical shifts image generated by 'csview' command by default image viewer.
    --from-format=<format>          Input file format, available formats: nmrstar, json [default: nmrstar].
    --to-format=<format>            Output file format, available formats: nmrstar, json [default: json].
    --nmrstar-version=<version>     Version of NMR-STAR format to use, available: 2, 3 [default: 3].
    --bmrb-url=<url>                URL to BMRB REST interface [default: http://rest.bmrb.wisc.edu/bmrb/NMR-STAR3/].
    --aa=<aa>                       Comma-separated amino acid three-letter codes (e.g. --aa=ALA,SER).
    --at=<at>                       Comma-separated BMRB atom codes (e.g. --at=CA,CB).
    --aa-at=<aa-at>                 Amino acid three-letter codes (keys) and corresponding atoms (values) (e.g. --aa-at=ALA-CA,CB:LYS-CB,CG,CD).
    --csview-outfile=<path>         Where to save chemical shifts table.
    --csview-format=<format>        Format to which save chemical shift table [default: svg].
    --plsplit=<%>                   How to split peak list into chunks by percent [default: 100].
    --spectrum-descriptions=<path>  Path to custom spectrum descriptions file.
    --distribution=<func>           Statistical distribution function [default: normal].
    --seed=<value>                  Integer value used to initialize a pseudorandom number generator during peak list simulation.
    --H=<value>                     Statistical distribution parameter(s) for H dimension.
    --C=<value>                     Statistical distribution parameter(s) for C dimension.
    --N=<value>                     Statistical distribution parameter(s) for N dimension.

CLI Converting NMR-STAR files in bulk

CLI one-to-one file conversions

  • Convert from a local file in NMR-STAR format to a local file in JSON format:
In [60]:
! python3 -m nmrstarlib convert bmr18569.str out/bmr18569.json \
          --from-format=nmrstar --to-format=json
  • Convert from a local file in JSON format to a local file in NMR-STAR format:
In [61]:
! python3 -m nmrstarlib convert bmr18569.json out/bmr18569.str \
          --from-format=json --to-format=nmrstar
  • Convert from a compressed local file in NMR-STAR format to a compressed local file in JSON format:
In [62]:
! python3 -m nmrstarlib convert bmr18569.str.gz out/bmr18569.json.gz \
          --from-format=nmrstar --to-format=json
  • Convert from a compressed local file in JSON format to a compressed local file in NMR-STAR format:
In [63]:
! python3 -m nmrstarlib convert bmr18569.json.gz out/bmr18569.str.gz \
          --from-format=json --to-format=nmrstar
  • Convert from an uncompressed URL file in NMR-STAR format to a compressed local file in JSON format:
In [64]:
! python3 -m nmrstarlib convert 18569 out/bmr18569.json.bz2 \
          --from-format=nmrstar --to-format=json
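
A converted JSON file, compressed or not, can be read back with the standard library alone. The sketch below is a hedged illustration: load_json_any() is a hypothetical helper, and the stand-in content only assumes that top-level keys mirror saveframe names such as those printed earlier in this tutorial. A real converted file holds the full entry.

```python
import bz2
import gzip
import json
import os
import tempfile

# Map file extensions to the matching standard-library opener.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def load_json_any(path):
    """Hypothetical helper: load JSON from a plain, gzip, or bzip2 file."""
    opener = OPENERS.get(os.path.splitext(path)[1], open)
    with opener(path, "rt", encoding="utf-8") as infile:
        return json.load(infile)


# Stand-in content (assumption); not the real structure of entry 18569.
stand_in = {"data": "18569", "save_entry_information": {}}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "bmr18569.json.bz2")
    with bz2.open(path, "wt", encoding="utf-8") as outfile:
        json.dump(stand_in, outfile)
    loaded = load_json_any(path)

print(sorted(loaded))
```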

Note

See nmrstarlib.converter for full list of available conversions.

CLI many-to-many file conversions

  • Convert from a directory of files in NMR-STAR format to a directory of files in JSON format:
In [65]:
! python3 -m nmrstarlib convert starfiles_dir_nmrstar out/starfiles_dir_json \
          --from-format=nmrstar --to-format=json
  • Convert from a directory of files in JSON format to a directory of files in NMR-STAR format:
In [66]:
! python3 -m nmrstarlib convert starfiles_dir_json out/starfiles_dir_nmrstar \
          --from-format=json --to-format=nmrstar
  • Convert from a directory of files in NMR-STAR format to a zip archive of files in JSON format:
In [67]:
! python3 -m nmrstarlib convert starfiles_dir_nmrstar out/starfiles_json.zip \
          --from-format=nmrstar --to-format=json
  • Convert from a compressed tar archive of files in JSON format to a directory of files in NMR-STAR format:
In [68]:
! python3 -m nmrstarlib convert starfiles_json.tar.gz out/starfiles_dir_nmrstar \
          --from-format=json --to-format=nmrstar
  • Convert from a zip archive of files in NMR-STAR format to a compressed tar archive of files in JSON format:
In [69]:
! python3 -m nmrstarlib convert starfiles_nmrstar.zip out/starfiles_json.tar.bz2 \
          --from-format=nmrstar --to-format=json

Note

See nmrstarlib.converter for full list of available conversions.

CLI Creating simulated peak list files from NMR-STAR files in bulk

CLI one-to-one file simulations

  • Creating a zero-variance HNcoCACB peak list file in sparky-like format from local NMR-STAR formatted file (bmr18569.str):
In [70]:
! python3 -m nmrstarlib plsimulate bmr18569.str out/18569_HNcoCACB.txt HNcoCACB \
          --from-format=nmrstar --to-format=sparky
  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to peak dimensions from a single source of variance, i.e. 100% of peaks will have chemical shift values adjusted using noise values from the defined random normal distribution (note that we can use 18569 BMRB id instead of local file):
In [71]:
! python3 -m nmrstarlib plsimulate 18569 out/18569_HNcoCACB_ssv_HCN.txt HNcoCACB \
          --from-format=nmrstar --to-format=sparky \
          --H=0,0.001 --N=0,0.01 --C=0,0.01
  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to peak dimensions from a single source of variance, i.e. 100% of peaks will have chemical shift values adjusted using noise values from the defined chisquare distribution for degrees of freedom equal to 5:
In [72]:
! python3 -m nmrstarlib plsimulate 18569 out/18569_HNcoCACB_ssv_HCN_chi2.txt HNcoCACB \
          --from-format=nmrstar --to-format=sparky \
          --H=5 --N=5 --C=5 --distribution=chisquare
  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to H and N peak dimensions but not C peak dimension from a single source of variance, i.e. 100% of peaks will have chemical shift values adjusted using noise values from the defined random normal distribution (note that we can use compressed bmr18569.str.gz file):
In [73]:
! python3 -m nmrstarlib plsimulate bmr18569.str.gz out/18569_HNcoCACB_ssv_HN.txt HNcoCACB \
          --from-format=nmrstar --to-format=sparky \
          --H=0,0.001 --N=0,0.01
  • Creating a HNcoCACB peak list file in sparky-like format and adding noise values to peak dimensions from two sources of variance, i.e. chemical shift values will be adjusted using noise values from two random normal distributions. To specify two sources of variance, we need to state how the peak list is split and provide statistical distribution parameters for both distributions. Let’s say we want 70% of peaks to have a smaller variance in the H and N dimensions and 30% of peaks to have a larger variance in the H and N dimensions. Note that the distribution parameters are separated by , and the values of each parameter for the individual splits are separated by :, so --H=0:0,0.001:0.005 sets loc to 0 for both splits and scale to 0.001 for the first split and 0.005 for the second.
In [74]:
! python3 -m nmrstarlib plsimulate 18569 out/18569_HNcoCACB_tsv_HN.txt HNcoCACB \
          --from-format=nmrstar --to-format=sparky \
          --plsplit=70,30 --H=0:0,0.001:0.005 --N=0:0,0.01:0.05
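
The option syntax above can be sketched in plain Python. This is an illustration under stated assumptions, not nmrstarlib's own parser or noise generator: parse_dimension_option(), split_sizes(), and add_noise() are hypothetical helpers, and peaks are assigned to chunks in order, which may differ from what the library does.

```python
import random


def parse_dimension_option(value):
    """Hypothetical parser: '0:0,0.001:0.005' -> ([0.0, 0.0], [0.001, 0.005])."""
    loc_part, scale_part = value.split(",")      # parameters are separated by ','
    locs = [float(x) for x in loc_part.split(":")]      # per-split values by ':'
    scales = [float(x) for x in scale_part.split(":")]
    return locs, scales


def split_sizes(n_peaks, percents):
    """Chunk sizes by percent; the last chunk takes whatever remains."""
    sizes = [n_peaks * p // 100 for p in percents[:-1]]
    sizes.append(n_peaks - sum(sizes))
    return sizes


def add_noise(shifts, percents, locs, scales, seed=None):
    """Add normal noise to each chunk using that chunk's loc and scale."""
    rng = random.Random(seed)
    noisy = []
    start = 0
    for size, loc, scale in zip(split_sizes(len(shifts), percents), locs, scales):
        for shift in shifts[start:start + size]:
            noisy.append(shift + rng.gauss(loc, scale))
        start += size
    return noisy


# Ten made-up H chemical shifts (ppm), for illustration only.
shifts = [8.0 + 0.1 * i for i in range(10)]
locs, scales = parse_dimension_option("0:0,0.001:0.005")
noisy = add_noise(shifts, (70, 30), locs, scales, seed=0)

for original, adjusted in zip(shifts, noisy):
    print("{:.3f} -> {:.4f}".format(original, adjusted))
```

With a 70/30 split of ten peaks, the first seven values receive noise with scale 0.001 and the last three with scale 0.005.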

Note

See nmrstarlib.converter for full list of available one-to-one and many-to-many input and output formats.

CLI many-to-many file simulations

  • Simulate zero-variance HNcoCACB peak lists from a directory of NMR-STAR formatted files to a directory of peak list files:
In [75]:
! python3 -m nmrstarlib plsimulate starfiles_dir_nmrstar out/peaklists_dir HNcoCACB \
          --from-format=nmrstar --to-format=sparky
  • Simulate HNcoCACB peak lists from a directory of NMR-STAR formatted files to a zip archive of peak list files, add random normal noise values to H and N peak dimensions:
In [76]:
! python3 -m nmrstarlib plsimulate starfiles_dir_nmrstar out/peaklists.zip HNcoCACB \
          --from-format=nmrstar --to-format=sparky --H=0,0.001 --N=0,0.01
  • Simulate NCACX peak lists from a directory of NMR-STAR formatted files to a tar.gz archive of peak list files, add random normal noise values to C and N peak dimensions using two sources of variance, 70 % of peaks will have smaller variance, 30 % of peaks will have larger variance:
In [77]:
! python3 -m nmrstarlib plsimulate starfiles_dir_nmrstar out/peaklists.tar.gz NCACX \
          --from-format=nmrstar --to-format=sparky --plsplit=70,30 \
          --C=0:0,0.01:0.05 --N=0:0,0.01:0.07

Note

See nmrstarlib.converter for full list of available one-to-one and many-to-many input and output formats.

CLI Visualizing chemical shift values

  • Visualize chemical shift values for the entire sequence:
In [78]:
! python3 -m nmrstarlib csview 18569 --csview-outfile=out/18569_chem_shifts_all \
          --csview-format=png
_images/18569_chem_shifts_all.png
  • Visualize CA, CB, CG, and CG2 chemical shift values for GLU and THR amino acid residues:
In [79]:
! python3 -m nmrstarlib csview 18569 --aa=GLU,THR --at=CA,CB,CG,CG2 \
          --csview-outfile=out/18569_chem_shifts_GLU_THR_CA_CB_CG_CG2 \
          --csview-format=png
_images/18569_chem_shifts_GLU_THR_CA_CB_CG_CG2.png
  • Visualize specific atoms for specific amino acids.
In [80]:
! python3 -m nmrstarlib csview 18569 --aa-at=GLU-CA,CB:THR-HA,HB \
          --csview-outfile=out/18569_chem_shifts_GLU_CA_CB_THR_HA_HB \
          --csview-format=png

nmrstarlib.csviewer output example:

_images/18569_chem_shifts_GLU_CA_CB_THR_HA_HB.png
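
The --aa-at value used above corresponds to the dictionary passed as amino_acids_and_atoms in the library example. The sketch below shows one way such a string could be turned into that dictionary; parse_aa_at() is a hypothetical helper written for illustration and is not taken from nmrstarlib.

```python
def parse_aa_at(value):
    """Hypothetical helper: 'GLU-CA,CB:THR-HA,HB' -> {'GLU': [...], 'THR': [...]}."""
    result = {}
    for group in value.split(":"):             # residues are separated by ':'
        residue, atoms = group.split("-", 1)   # residue code, then its atoms
        result[residue] = atoms.split(",")     # atoms are separated by ','
    return result


print(parse_aa_at("GLU-CA,CB:THR-HA,HB"))
```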