Database

class sunpy.database.Database(url[, CacheClass[, cache_size[, default_waveunit]]])

Bases: object

Parameters:
  • url (str) – A URL describing the database. This value is simply passed to sqlalchemy.create_engine(). If not specified, the value will be read from the sunpy config file.

  • CacheClass (sunpy.database.caching.BaseCache) – A concrete cache implementation of the abstract class BaseCache. Built-in supported values for this parameter are sunpy.database.caching.LRUCache and sunpy.database.caching.LFUCache. The default value is sunpy.database.caching.LRUCache.

  • cache_size (int) – The maximum number of database entries in the cache. The default is no limit.

  • default_waveunit (str or Quantity, optional) – The wavelength unit that will be used if an entry is added to the database but its wavelength unit cannot be found (either in the file or the VSO query result block, depending on the way the entry was added). If a Unit is passed, it is assigned to default_waveunit. If a str is passed, it will be converted to a Quantity through the Quantity initializer, and then assigned to default_waveunit. If an invalid string is passed, WaveunitNotConvertibleError is raised. If None (the default), attempting to add an entry without knowing the wavelength unit results in a sunpy.database.tables.WaveunitNotFoundError.
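
The difference between the two built-in cache policies named for CacheClass can be illustrated with a small standalone sketch. The classes TinyLRU and TinyLFU and the helper demo are hypothetical stand-ins written for this illustration; they are not the sunpy.database.caching implementations and only mimic the eviction rule each policy is named after.

```python
from collections import Counter, OrderedDict


class TinyLRU:
    """Evicts the entry that was used least recently."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key):
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # drop the oldest entry

    def keys(self):
        return list(self._data)


class TinyLFU:
    """Evicts the entry that was used least often."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = {}
        self._hits = Counter()

    def get(self, key):
        self._hits[key] += 1
        return self._data[key]

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.maxsize:
            victim = min(self._data, key=lambda k: self._hits[k])
            del self._data[victim]
            del self._hits[victim]
        self._data[key] = value
        self._hits[key] += 1

    def keys(self):
        return list(self._data)


def demo(cache):
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")
    cache.get("a")
    cache.get("b")     # "b" is now the most recently used key
    cache.put("c", 3)  # the cache holds two entries, so one must go
    return sorted(cache.keys())


lru_keys = demo(TinyLRU(2))  # "a" is the least recently used key
lfu_keys = demo(TinyLFU(2))  # "b" is the least frequently used key
```

With the same access pattern, the LRU sketch drops "a" while the LFU sketch drops "b", which is the kind of behavioural difference the CacheClass argument selects between.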

Attributes Summary

cache_maxsize

cache_size

tags

url

The sqlalchemy url of the database instance

Methods Summary

add(database_entry[, ignore_already_added])

Add the given database entry to the database table.

add_from_dir(path[, recursive, pattern, ...])

Search the given directory for FITS files and use their FITS headers to add new entries to the database.

add_from_fido_search_result(search_result[, ...])

Generate database entries from a Fido search result and add all the generated entries to this database.

add_from_file(file[, ignore_already_added])

Generate as many database entries as there are FITS headers in the given file and add them to the database.

add_from_hek_query_result(query_result[, ...])

Add database entries from a HEK query result.

add_from_vso_query_result(query_result[, ...])

Generate database entries from a VSO query result and add all the generated entries to this database.

add_many(database_entries[, ...])

Add a list of database entries "at once".

clear()

Remove all entries from the database.

clear_histories()

Clears all entries from the undo and redo history.

commit()

Flush pending changes and commit the current transaction.

display_entries([columns, sort])

download_from_hek_query_result(query_result)

Add new database entries from a HEK query result by converting it into a VSO query and downloading the corresponding data files.

download_from_vso_query_result(query_result)

Add new database entries from a VSO query result and download the corresponding data files.

edit(database_entry, **kwargs)

Change the given database entry so that it interprets the passed key-value pairs as new values where the keys represent the attributes of this entry.

fetch(*query, **kwargs)

Check if the query has already been used to collect new data.

get_entry_by_id(entry_id)

Get a database entry by its unique ID number.

get_tag(tag_name)

Get the tag which has the given name.

redo([n])

Redo the last n commands.

remove(database_entry)

Remove the given database entry from the database table.

remove_many(database_entries)

Remove a list of database entries "at once".

remove_tag(database_entry, tag_name)

Remove the given tag from the database entry.

search(*query[, sortby])

Send the given query to the database and return a list of database entries that satisfy all of the given attributes.

set_cache_size(cache_size)

Set a new value for the maximum number of database entries in the cache.

show_in_browser([columns, sort, jsviewer])

star(database_entry[, ignore_already_starred])

Mark the given database entry as starred.

tag(database_entry, *tags)

Assign the given database entry the given tags.

undo([n])

Undo the last n commands.

unstar(database_entry[, ...])

Remove the starred mark of the given entry.

Attributes Documentation

cache_maxsize
cache_size
tags
url

The sqlalchemy url of the database instance

Methods Documentation

add(database_entry, ignore_already_added=False)

Add the given database entry to the database table.

Parameters:
  • database_entry (sunpy.database.tables.DatabaseEntry) – The database entry that will be added to this database.

  • ignore_already_added (bool, optional) – If False (the default), attempting to add a database entry that already exists raises sunpy.database.EntryAlreadyAddedError. If True, the entry is added anyway and the database will contain duplicates.

add_from_dir(path, recursive=False, pattern='*', ignore_already_added=False, time_string_parse_format=None)

Search the given directory for FITS files and use their FITS headers to add new entries to the database. Note that one database entry is assigned to a list of FITS headers, so the number of entries added is determined by the number of FITS files read, not by the number of FITS headers. FITS files are detected by reading the content of each file; the pattern argument can be used to avoid reading every file in a directory when all FITS files are known to share the same filename extension.

Parameters:
  • path (str) – The directory where to look for FITS files.

  • recursive (bool, optional) – If True, the given directory will be searched recursively. Otherwise, only the given directory and no subdirectories are searched. The default is False, i.e. the given directory is not searched recursively.

  • pattern (str, optional) – The pattern can be used to filter the list of filenames before the files are attempted to be read. The default is to collect all files. This value is passed to the function fnmatch.filter(), see its documentation for more information on the supported syntax.

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().

  • time_string_parse_format (str, optional) – Fallback timestamp format which will be passed to strptime if sunpy.time.parse_time is unable to automatically read the date-obs metadata.
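
The pattern and time_string_parse_format parameters can be illustrated without sunpy. The sketch below filters a list of filenames with fnmatch.filter(), as the pattern parameter is documented to do, and shows a strptime fallback. The helper parse_date_obs is hypothetical, and datetime.fromisoformat only stands in for sunpy.time.parse_time, which accepts many more formats.

```python
import fnmatch
from datetime import datetime

filenames = ["aia_0094.fits", "aia_0335.fits", "notes.txt", "README"]

# Shell-style wildcards, the same syntax the ``pattern`` argument accepts.
fits_candidates = fnmatch.filter(filenames, "*.fits")


def parse_date_obs(value, fallback_format=None):
    """Try ISO 8601 first, then the caller-supplied strptime format."""
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        if fallback_format is None:
            raise
        # Fallback path, analogous to ``time_string_parse_format``.
        return datetime.strptime(value, fallback_format)


iso_time = parse_date_obs("2012-08-05T00:00:01")
odd_time = parse_date_obs("05.08.2012 00:00:01", "%d.%m.%Y %H:%M:%S")
```

Filtering by name first means that only the candidate files need to be opened to check whether they really are FITS files.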

add_from_fido_search_result(search_result, ignore_already_added=False)

Generate database entries from a Fido search result and add all the generated entries to this database.

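Parameters:
  • search_result – The Fido search result from which the database entries are generated.

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().

add_from_file(file, ignore_already_added=False)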

Generate as many database entries as there are FITS headers in the given file and add them to the database.

Parameters:
  • file (str, file object) – Either a path pointing to a FITS file or an opened file-like object. If an opened file object is given, its mode must be one of the following: rb, rb+, or ab+.

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().

add_from_hek_query_result(query_result, ignore_already_added=False)

Add database entries from a HEK query result.

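Parameters:
  • query_result – The HEK query result, e.g. the value returned by sunpy.net.hek.HEKClient.search().

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().

add_from_vso_query_result(query_result, ignore_already_added=False)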

Generate database entries from a VSO query result and add all the generated entries to this database.

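Parameters:
  • query_result – The VSO query result from which the database entries are generated.

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().

add_many(database_entries, ignore_already_added=False)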

Add a list of database entries “at once”. If this method is used, only one entry is saved in the undo history.

Parameters:
  • database_entries (list) – The list of DatabaseEntry that will be added to the database.

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().
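
To make the "only one entry is saved in the undo history" behaviour concrete, here is a minimal command-history sketch. The History class and the module-level add and add_many helpers are hypothetical and are not sunpy's implementation. In particular, clearing the redo stack when a new command runs is an assumption of this sketch.

```python
class History:
    """Minimal undo/redo stack; each item is a (do, undo) pair of callables."""

    def __init__(self):
        self._undo, self._redo = [], []

    def run(self, do, undo):
        do()
        self._undo.append((do, undo))
        self._redo.clear()  # assumption: a new command drops the redo stack

    def undo(self, n=1):
        for _ in range(n):
            do, undo = self._undo.pop()
            undo()
            self._redo.append((do, undo))

    def redo(self, n=1):
        for _ in range(n):
            do, undo = self._redo.pop()
            do()
            self._undo.append((do, undo))


table = []
history = History()


def add(entry):
    history.run(lambda: table.append(entry), lambda: table.remove(entry))


def add_many(entries):
    entries = list(entries)

    def do():
        table.extend(entries)

    def undo():
        for entry in entries:
            table.remove(entry)

    history.run(do, undo)  # the whole batch is one history item


add("e1")
add_many(["e2", "e3", "e4"])
history.undo()  # removes e2, e3 and e4 in a single step
after_undo = list(table)
history.redo()  # brings all three back
after_redo = list(table)
```

Had the three entries been added one by one, three separate undo steps would have been needed to remove them.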

clear()

Remove all entries from the database. This operation can be undone using the undo() method.

clear_histories()

Clears all entries from the undo and redo history.

commit()

Flush pending changes and commit the current transaction. This is a shortcut for calling commit() on the underlying SQLAlchemy session.

display_entries(columns=None, sort=False)

download_from_hek_query_result(query_result, client=None, path=None, progress=False, ignore_already_added=False, overwrite=False)

Add new database entries from a HEK query result by converting it into a VSO query and downloading the corresponding data files.

Parameters:
  • query_result (HEKTable or HEKRow) – The value returned by sunpy.net.hek.HEKClient.search().

  • client (sunpy.net.vso.VSOClient, optional) – VSO Client instance to use for search and download. If not specified a new instance will be created.

  • path (str) – Path to download the files.

  • progress (bool) – If True, displays the progress bar during file download.

  • ignore_already_added (bool) – See sunpy.database.Database.add().

  • overwrite (bool, optional) – If True, matching database entries from the query results will be deleted and replaced with new database entries, with all files getting downloaded. Otherwise, no new file download and update of matching database entries takes place.

download_from_vso_query_result(query_result, client=None, path=None, progress=False, ignore_already_added=False, overwrite=False)

Add new database entries from a VSO query result and download the corresponding data files. See sunpy.database.Database.fetch() for information about the caching mechanism used and about the parameters client, path, progress.

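Parameters:
  • query_result – The VSO query result from which the database entries are generated.

  • ignore_already_added (bool, optional) – See sunpy.database.Database.add().

edit(database_entry, **kwargs)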

Change the given database entry so that it interprets the passed key-value pairs as new values where the keys represent the attributes of this entry. If no keyword arguments are given, ValueError is raised.

fetch(*query, **kwargs)

Check if the query has already been used to collect new data.

If yes, query the database using the method sunpy.database.Database.search() and return the result.

Otherwise, the retrieved search result is used to download all files that belong to this search result. After that, all the gathered information (both from the query result and from the downloaded files) is added to the database such that each header is represented by one database entry.

It uses the sunpy.database.Database._download_and_collect_entries() method to download files, which uses query result block level caching. This means that files will not be downloaded for any query result block that had its files downloaded previously. If files for Query A were already downloaded, and then Query B is made which has some result blocks common with Query A, then files for these common blocks will not be downloaded again. Files will only be downloaded for those blocks which are new or haven’t had their files downloaded yet.

If querying results in no data, no operation is performed. Concretely, this means that no entry is added to the database and no file is downloaded.

Parameters:
  • *query (list) – A variable number of attributes that are chained together via the boolean AND operator. The | operator may be used between attributes to express the boolean OR operator.

  • path (str, optional) – The directory into which files will be downloaded.

  • overwrite (bool, optional) – If True, matching database entries from the query results will be deleted and replaced with new database entries, with all files getting downloaded. Otherwise, no new file download and update of matching database entries takes place.

  • client (sunpy.net.vso.VSOClient, optional) – VSO Client instance to use for search and download. If not specified a new instance will be created.

  • progress (bool, optional) – If True, displays the progress bar during file download.

  • methods (str or iterable of str, optional) – Set the VSOClient download method; see sunpy.net.vso.VSOClient.fetch() for details.

Examples

This method can be used along with the overwrite=True argument to overwrite and redownload files corresponding to the query, even if its entries are already present in the database. Note that the overwrite=True argument deletes the old matching database entries and new database entries are added with information from the redownloaded files.

>>> from sunpy.database import Database
>>> from sunpy.database.tables import display_entries
>>> from sunpy.net import vso, attrs as a
>>> database = Database('sqlite:///:memory:')
>>> database.fetch(a.Time('2012-08-05', '2012-08-05 00:00:05'),
...                a.Instrument.aia)  
>>> print(display_entries(database,
...                       ['id', 'observation_time_start', 'observation_time_end',
...                        'instrument', 'wavemin', 'wavemax']))  
     id observation_time_start observation_time_end instrument wavemin wavemax
    --- ---------------------- -------------------- ---------- ------- -------
      1    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
      2    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
      3    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
      4    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
>>> database.fetch(a.Time('2012-08-05', '2012-08-05 00:00:01'),
...                a.Instrument.aia, overwrite=True)  
>>> print(display_entries(database,
...                       ['id', 'observation_time_start', 'observation_time_end',
...                        'instrument', 'wavemin', 'wavemax']))  
     id observation_time_start observation_time_end instrument wavemin wavemax
    --- ---------------------- -------------------- ---------- ------- -------
      3    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
      4    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
      5    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
      6    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4

Here the first 2 entries (IDs 1 and 2) were overwritten and their files were redownloaded, resulting in the entries with IDs 5 and 6.
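
The query result block level caching described for fetch() can be sketched in a few lines. The names downloaded_blocks, download_log and fetch_blocks are hypothetical, and the "download" is only simulated by appending to a list. The sketch shows the bookkeeping idea, not the logic of sunpy.database.Database._download_and_collect_entries().

```python
downloaded_blocks = set()  # identifiers of blocks whose files are on disk
download_log = []          # records what the simulated downloader fetched


def fetch_blocks(blocks):
    """Download only the blocks that have not been downloaded before."""
    new_blocks = [b for b in blocks if b not in downloaded_blocks]
    for block in new_blocks:
        download_log.append(block)  # stand-in for the real file download
        downloaded_blocks.add(block)
    return new_blocks


query_a = ["block-1", "block-2"]
query_b = ["block-2", "block-3"]  # shares block-2 with query A

first = fetch_blocks(query_a)   # both blocks are new
second = fetch_blocks(query_b)  # only block-3 still needs its files
```

Because the check is made per block rather than per query, overlapping queries trigger a download only for the blocks that are actually new.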

get_entry_by_id(entry_id)

Get a database entry by its unique ID number. If an entry with the given ID does not exist, sunpy.database.EntryNotFoundError is raised.

get_tag(tag_name)

Get the tag which has the given name. If no such tag exists, sunpy.database.NoSuchTagError is raised.

redo(n=1)

Redo the last n commands.

remove(database_entry)

Remove the given database entry from the database table.

remove_many(database_entries)

Remove a list of database entries “at once”. If this method is used, only one entry is saved in the undo history.

Parameters:

database_entries (list) – The list of DatabaseEntry objects that will be removed from the database.

remove_tag(database_entry, tag_name)

Remove the given tag from the database entry. If the tag is not connected to any entry after this operation, the tag itself is removed from the database as well.

Raises:

sunpy.database.NoSuchTagError – If the tag is not connected to the given entry.
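
The cleanup rule for remove_tag() (a tag disappears once no entry uses it) can be sketched with plain dictionaries and sets. Everything here is a hypothetical stand-in: the local NoSuchTagError class, the entry_tags mapping and the module-level remove_tag function only imitate the documented behaviour.

```python
class NoSuchTagError(Exception):
    """Local stand-in for sunpy.database.NoSuchTagError."""


entry_tags = {1: {"flare", "aia"}, 2: {"aia"}}  # entry id -> tag names
all_tags = {"flare", "aia"}                     # tags known to the database


def remove_tag(entry_id, tag_name):
    if tag_name not in entry_tags[entry_id]:
        raise NoSuchTagError(tag_name)
    entry_tags[entry_id].remove(tag_name)
    # Drop the tag itself once no entry refers to it any more.
    if not any(tag_name in tags for tags in entry_tags.values()):
        all_tags.discard(tag_name)


remove_tag(1, "flare")  # "flare" is now unused and is removed entirely
remove_tag(1, "aia")    # entry 2 still uses "aia", so the tag stays
```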

search(*query[, sortby])

Send the given query to the database and return a list of database entries that satisfy all of the given attributes.

Apart from the attributes supported by the VSO interface, database-specific attributes from sunpy.database.attrs, such as Starred and Tag, are supported.

An important difference to the VSO attributes is that these attributes may also be used in negated form using the tilde ~ operator.

Parameters:
  • *query (list) – A variable number of attributes that are chained together via the boolean AND operator. The | operator may be used between attributes to express the boolean OR operator.

  • sortby (str, optional) – The column by which to sort the returned entries. The default is to sort by the start of the observation. See the attributes of sunpy.database.tables.DatabaseEntry for a list of all possible values.

Returns:

table (list) – List of sunpy.database.tables.DatabaseEntry objects that satisfy all of the given attributes.

Raises:

TypeError – If no attribute is given or if some keyword argument other than ‘sortby’ is given.

Examples

The query in the following example searches for all non-starred entries with the tag ‘foo’ or ‘bar’ (or both).

>>> database.search(~attrs.Starred(), attrs.Tag('foo') | attrs.Tag('bar'))   
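
How a query like the one above can combine attributes with ~, | and an implicit AND between positional arguments is sketched below with ordinary Python operator overloading. All classes and the module-level search function are hypothetical and operate on plain dictionaries. They mirror the documented semantics only and are not the sunpy.database.attrs implementation.

```python
class Attr:
    """Base class: combine predicates with &, | and ~."""

    def __and__(self, other):
        return Both(self, other)

    def __or__(self, other):
        return Either(self, other)

    def __invert__(self):
        return Not(self)


class Starred(Attr):
    def matches(self, entry):
        return entry["starred"]


class Tag(Attr):
    def __init__(self, name):
        self.name = name

    def matches(self, entry):
        return self.name in entry["tags"]


class Both(Attr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def matches(self, entry):
        return self.left.matches(entry) and self.right.matches(entry)


class Either(Attr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def matches(self, entry):
        return self.left.matches(entry) or self.right.matches(entry)


class Not(Attr):
    def __init__(self, inner):
        self.inner = inner

    def matches(self, entry):
        return not self.inner.matches(entry)


def search(entries, *query):
    """Return the entries that satisfy every attribute in ``query``."""
    if not query:
        raise TypeError("at least one attribute must be given")
    return [e for e in entries if all(attr.matches(e) for attr in query)]


entries = [
    {"id": 1, "starred": False, "tags": {"foo"}},
    {"id": 2, "starred": True, "tags": {"bar"}},
    {"id": 3, "starred": False, "tags": {"baz"}},
    {"id": 4, "starred": False, "tags": {"foo", "bar"}},
]
hits = search(entries, ~Starred(), Tag("foo") | Tag("bar"))
hit_ids = [e["id"] for e in hits]
```

Entry 2 is excluded because it is starred, and entry 3 because it carries neither tag.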

set_cache_size(cache_size)

Set a new value for the maximum number of database entries in the cache. Use the value float('inf') to remove the size limit. If the new cache is smaller than the previous one and cannot contain all the entries anymore, entries are removed from the cache until the number of entries equals the cache size. Which entries are removed depends on the implementation of the cache (e.g. sunpy.database.caching.LRUCache, sunpy.database.caching.LFUCache).
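
The shrinking step can be pictured with an OrderedDict used as a least-recently-used store. The helper shrink_lru is hypothetical and assumes an LRU policy. With a different cache class, other entries would be dropped.

```python
from collections import OrderedDict


def shrink_lru(cache, new_size):
    """Drop least recently used items until ``cache`` fits ``new_size``."""
    while len(cache) > new_size:
        cache.popitem(last=False)  # the first item is the oldest one
    return cache


cache = OrderedDict((i, f"entry-{i}") for i in range(1, 6))
cache.move_to_end(1)  # entry 1 was just used, so it becomes the newest

unlimited = list(shrink_lru(cache, float("inf")))  # nothing is removed
remaining = list(shrink_lru(cache, 2))             # only the 2 newest stay
```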

show_in_browser(columns=None, sort=False, jsviewer=True)

star(database_entry, ignore_already_starred=False)

Mark the given database entry as starred. If this entry is already marked as starred, the behaviour depends on the optional argument ignore_already_starred: if it is False (the default), sunpy.database.EntryAlreadyStarredError is raised. Otherwise, the entry is kept as starred and no exception is raised.

tag(database_entry, *tags)

Assign the given database entry the given tags.

Raises:

TypeError – If no tags are given.

undo(n=1)

Undo the last n commands.

unstar(database_entry, ignore_already_unstarred=False)

Remove the starred mark of the given entry. If this entry is not marked as starred, the behaviour depends on the optional argument ignore_already_unstarred: if it is False (the default), sunpy.database.EntryAlreadyUnstarredError is raised. Otherwise, the entry is kept as unstarred and no exception is raised.