# Database

class sunpy.database.Database(url[, CacheClass[, cache_size[, default_waveunit]]])

Bases: object

Parameters:
- url (str) – A URL describing the database. This value is passed unchanged to sqlalchemy.create_engine(). If not specified, the value is read from the sunpy config file.
- CacheClass (sunpy.database.caching.BaseCache) – A concrete implementation of the abstract class BaseCache. The built-in implementations are sunpy.database.caching.LRUCache and sunpy.database.caching.LFUCache. The default is sunpy.database.caching.LRUCache.
- cache_size (int) – The maximum number of database entries. The default is no limit.
- default_waveunit (str or Unit, optional) – The wavelength unit used when an entry is added to the database but its wavelength unit cannot be found (either in the file or in the VSO query result block, depending on how the entry was added). If a Unit is passed, it is assigned to default_waveunit. If a str is passed, it is converted to a Unit through the astropy.units.Unit() initializer and then assigned to default_waveunit. If an invalid string is passed, WaveunitNotConvertibleError is raised. If None (the default), attempting to add an entry without a known wavelength unit raises sunpy.database.WaveunitNotFoundError.

Attributes Summary

- cache_maxsize
- cache_size
- tags
- url – The sqlalchemy url of the database instance

Methods Summary

- add(database_entry[, ignore_already_added]) – Add the given database entry to the database table.
- add_from_dir(path[, recursive, pattern, …]) – Search the given directory for FITS files and use their FITS headers to add new entries to the database.
- add_from_fido_search_result(search_result[, …]) – Generate database entries from a Fido search result and add all the generated entries to this database.
- add_from_file(file[, ignore_already_added]) – Generate as many database entries as there are FITS headers in the given file and add them to the database.
- add_from_hek_query_result(query_result[, …]) – Add database entries from a HEK query result.
- add_from_vso_query_result(query_result[, …]) – Generate database entries from a VSO query result and add all the generated entries to this database.
- add_many(database_entries[, …]) – Add a row of database entries “at once”.
- clear() – Remove all entries from the database.
- clear_histories() – Clears all entries from the undo and redo history.
- commit() – Flush pending changes and commit the current transaction.
- display_entries([columns, sort])
- download(*query, **kwargs) – Deprecated since version 0.8.
- download_from_vso_query_result(query_result) – download(query_result, client=sunpy.net.vso.VSOClient(), path=None, progress=False, ignore_already_added=False)
- edit(database_entry, **kwargs) – Change the given database entry so that it interprets the passed key-value pairs as new values where the keys represent the attributes of this entry.
- fetch(*query[, path, overwrite, client, …]) – Check if the query has already been used to collect new data.
- get_entry_by_id(entry_id) – Get a database entry by its unique ID number.
- get_tag(tag_name) – Get the tag which has the given name.
- query(*query, **kwargs) – Deprecated since version 0.8.
- redo([n]) – Redo the last n commands.
- remove(database_entry) – Remove the given database entry from the database table.
- remove_many(database_entries) – Remove a row of database entries “at once”.
- remove_tag(database_entry, tag_name) – Remove the given tag from the database entry.
- search(*query[, sortby]) – Send the given query to the database and return a list of database entries that satisfy all of the given attributes.
- set_cache_size(cache_size) – Set a new value for the maximum number of database entries in the cache.
- show_in_browser([columns, sort, jsviewer])
- star(database_entry[, ignore_already_starred]) – Mark the given database entry as starred.
- tag(database_entry, *tags) – Assign the given database entry the given tags.
- undo([n]) – Undo the last n commands.
- unstar(database_entry[, …]) – Remove the starred mark of the given entry.

Attributes Documentation

cache_maxsize
cache_size
tags
url

The sqlalchemy url of the database instance

Methods Documentation

add(database_entry, ignore_already_added=False)

Add the given database entry to the database table.

Parameters:
- database_entry (sunpy.database.tables.DatabaseEntry) – The database entry that will be added to this database.
- ignore_already_added (bool, optional) – If False (the default), attempting to add an entry that already exists in the database raises sunpy.database.EntryAlreadyAddedError. If True, the entry is added anyway, which leaves duplicates in the database.

add_from_dir(path, recursive=False, pattern='*', ignore_already_added=False, time_string_parse_format=None)

Search the given directory for FITS files and use their FITS headers to add new entries to the database. Note that one entry in the database is assigned to a list of FITS headers, so the number of database entries that will be added is determined by the number of FITS files read, not by the number of FITS headers. FITS files are detected by reading the content of each file; the pattern argument may be used to avoid reading every file in a directory when all FITS files are known to share the same filename extension.

Parameters:
- path (string) – The directory in which to look for FITS files.
- recursive (bool, optional) – If True, the given directory is searched recursively. Otherwise, only the given directory is searched, without its subdirectories. The default is False.
- pattern (string, optional) – A pattern used to filter the list of filenames before any file is read. The default is to collect all files. This value is passed to the function fnmatch.filter(); see its documentation for the supported syntax.
- ignore_already_added (bool, optional) – See sunpy.database.Database.add().
- time_string_parse_format (str, optional) – Fallback timestamp format that is passed to strptime if sunpy.time.parse_time is unable to read the date-obs metadata automatically.
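As a rough illustration of how recursive and pattern narrow the set of candidate files, the following sketch walks a directory and filters filenames with fnmatch.filter(). The helper candidate_files and the file names are made up for this example and are not part of sunpy; the actual method additionally inspects file contents to decide what is a FITS file.

```python
import fnmatch
import os
import tempfile


def candidate_files(path, recursive=False, pattern='*'):
    """Hypothetical helper: list files under `path` whose names match `pattern`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in fnmatch.filter(filenames, pattern):
            full_path = os.path.join(dirpath, name)
            matches.append(os.path.relpath(full_path, path))
        if not recursive:
            break  # os.walk yields the top-level directory first; stop there
    return sorted(matches)


# Build a small throwaway directory tree to demonstrate the filtering.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, 'sub'))
    for rel in ('a.fits', 'notes.txt', os.path.join('sub', 'b.fits')):
        open(os.path.join(root, rel), 'w').close()
    top_level = candidate_files(root, pattern='*.fits')
    everything = candidate_files(root, recursive=True, pattern='*.fits')
```

With pattern='*.fits', the text file is never considered, and the file in the subdirectory is only found when recursive is True.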
add_from_fido_search_result(search_result, ignore_already_added=False)

Generate database entries from a Fido search result and add all the generated entries to this database.

Parameters:
- search_result (sunpy.net.fido_factory.UnifiedResponse) – A UnifiedResponse object that is used to store responses from the unified downloader. This is returned by the search method of a sunpy.net.fido_factory.UnifiedDownloaderFactory object.
- ignore_already_added (bool) – See sunpy.database.Database.add().

add_from_file(file, ignore_already_added=False)

Generate as many database entries as there are FITS headers in the given file and add them to the database.

Parameters:
- file (str or file-like object) – Either a path pointing to a FITS file or an opened file-like object. If an opened file object is given, its mode must be one of the following: rb, rb+, or ab+.
- ignore_already_added (bool, optional) – See sunpy.database.Database.add().

add_from_hek_query_result(query_result, ignore_already_added=False)

Add database entries from a HEK query result.

Parameters:
- query_result (list) – The value returned by sunpy.net.hek.HEKClient().search().
- ignore_already_added (bool) – See sunpy.database.Database.add().

add_from_vso_query_result(query_result, ignore_already_added=False)

Generate database entries from a VSO query result and add all the generated entries to this database.

Parameters:
- query_result (sunpy.net.vso.QueryResponse) – A VSO query response that was returned by the query method of a sunpy.net.vso.VSOClient object.
- ignore_already_added (bool) – See sunpy.database.Database.add().

add_many(database_entries, ignore_already_added=False)

Add a row of database entries “at once”. If this method is used, only one entry is saved in the undo history.
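One way to picture why a bulk operation occupies a single slot in the undo history is the command pattern with a composite command. The sketch below is a toy model under that assumption; the classes AddEntry, CompositeCommand, and History are invented for illustration and are not sunpy's implementation.

```python
class AddEntry:
    """Hypothetical command that adds one entry to a list-backed store."""

    def __init__(self, store, entry):
        self.store, self.entry = store, entry

    def __call__(self):
        self.store.append(self.entry)

    def undo(self):
        self.store.remove(self.entry)


class CompositeCommand:
    """Groups several commands so that they are executed and undone as one unit."""

    def __init__(self, commands):
        self.commands = list(commands)

    def __call__(self):
        for command in self.commands:
            command()

    def undo(self):
        # Undo in reverse order of execution.
        for command in reversed(self.commands):
            command.undo()


class History:
    """Minimal undo/redo bookkeeping with two stacks."""

    def __init__(self):
        self.undo_stack, self.redo_stack = [], []

    def do(self, command):
        command()
        self.undo_stack.append(command)
        self.redo_stack.clear()

    def undo(self, n=1):
        for _ in range(n):
            command = self.undo_stack.pop()
            command.undo()
            self.redo_stack.append(command)

    def redo(self, n=1):
        for _ in range(n):
            command = self.redo_stack.pop()
            command()
            self.undo_stack.append(command)


store, history = [], History()
# Three additions wrapped into one history item.
history.do(CompositeCommand(AddEntry(store, e) for e in ('a', 'b', 'c')))
history_length = len(history.undo_stack)
history.undo()
after_undo = list(store)
history.redo()
after_redo = list(store)
```

In this model a single undo() call reverts all three additions, and a single redo() call restores them.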

clear()

Remove all entries from the database. This operation can be undone using the undo() method.

clear_histories()

Clears all entries from the undo and redo history.

commit()

Flush pending changes and commit the current transaction. This is a shortcut for session.commit().

display_entries(columns=None, sort=False)

download(*query, **kwargs)

Deprecated since version 0.8: The download function is deprecated and may be removed in a future version. Use database.fetch() instead.

download_from_vso_query_result(query_result, client=None, path=None, progress=False, ignore_already_added=False, overwrite=False)

Add new database entries from a VSO query result and download the corresponding data files. See sunpy.database.Database.download() for information about the caching mechanism used and about the parameters client, path, progress.

Parameters:
- query_result (sunpy.net.vso.QueryResponse) – A VSO query response that was returned by the query method of a sunpy.net.vso.VSOClient object.
- ignore_already_added (bool) – See sunpy.database.Database.add().

edit(database_entry, **kwargs)

Change the given database entry so that it interprets the passed key-value pairs as new values where the keys represent the attributes of this entry. If no keyword arguments are given, ValueError is raised.

fetch(*query[, path, overwrite, client, progress, methods])

Check if the query has already been used to collect new data.

If yes, query the database using the method sunpy.database.Database.search() and return the result.

Otherwise, the retrieved search result is used to download all files that belong to it. After that, all the gathered information (from both the query result and the downloaded files) is added to the database such that each header is represented by one database entry.

It uses the sunpy.database.Database._download_and_collect_entries() method to download files, which uses query result block level caching. This means that files will not be downloaded for any query result block that had its files downloaded previously. If files for Query A were already downloaded, and then Query B is made which has some result blocks common with Query A, then files for these common blocks will not be downloaded again. Files will only be downloaded for those blocks which are new or haven’t had their files downloaded yet.
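The block-level caching described above amounts to skipping every result block that has been seen before. The following sketch models it with plain strings standing in for query result blocks; the function blocks_to_download and the block names are hypothetical, and the real method works on VSO result blocks rather than strings.

```python
def blocks_to_download(query_blocks, downloaded_blocks):
    """Return, in order, the blocks whose files have not been downloaded yet."""
    return [block for block in query_blocks if block not in downloaded_blocks]


downloaded = set()

# Query A: nothing has been downloaded so far, so every block is fetched.
query_a = ['block-1', 'block-2', 'block-3']
first = blocks_to_download(query_a, downloaded)
downloaded.update(first)

# Query B shares two blocks with Query A; only the new block is fetched.
query_b = ['block-2', 'block-3', 'block-4']
second = blocks_to_download(query_b, downloaded)
downloaded.update(second)
```

Here the second query triggers a download for block-4 only, because block-2 and block-3 were already covered by the first query.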

If querying results in no data, no operation is performed. Concretely, this means that no entry is added to the database and no file is downloaded.

Parameters:
- query (list) – A variable number of attributes that are chained together via the boolean AND operator. The | operator may be used between attributes to express the boolean OR operator.
- path (str, optional) – The directory into which files will be downloaded.
- overwrite (bool, optional) – If True, matching database entries from the query results are deleted and replaced with new database entries, and all files are downloaded again. Otherwise, no files are downloaded and matching database entries are not updated.
- client (sunpy.net.vso.VSOClient, optional) – VSO Client instance to use for search and download. If not specified, a new instance is created.
- progress (bool, optional) – If True, a progress bar is displayed during file download.
- methods (str or iterable of str, optional) – Set the VSOClient download method; see sunpy.net.vso.VSOClient.get for details.

Examples

The fetch method can be used along with the overwrite=True argument to overwrite and redownload files corresponding to the query, even if its entries are already present in the database. Note that the overwrite=True argument deletes the old matching database entries and new database entries are added with information from the redownloaded files.

>>> from sunpy.database import Database
>>> from sunpy.database.tables import display_entries
>>> from sunpy.net import vso
>>> database = Database('sqlite:///:memory:')
>>> database.fetch(vso.attrs.Time('2012-08-05', '2012-08-05 00:00:05'),
...                vso.attrs.Instrument('AIA'))
>>> print(display_entries(database,
...                       ['id', 'observation_time_start', 'observation_time_end',
...                        'instrument', 'wavemin', 'wavemax']))
id observation_time_start observation_time_end instrument wavemin wavemax
--- ---------------------- -------------------- ---------- ------- -------
1    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
2    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
3    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
4    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
>>> database.fetch(vso.attrs.Time('2012-08-05', '2012-08-05 00:00:01'),
...                vso.attrs.Instrument('AIA'), overwrite=True)
>>> print(display_entries(database,
...                       ['id', 'observation_time_start', 'observation_time_end',
...                        'instrument', 'wavemin', 'wavemax']))
id observation_time_start observation_time_end instrument wavemin wavemax
--- ---------------------- -------------------- ---------- ------- -------
3    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
4    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
5    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
6    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4


Here the first 2 entries (IDs 1 and 2) were overwritten and their files were redownloaded, resulting in the entries with IDs 5 and 6.

get_entry_by_id(entry_id)

Get a database entry by its unique ID number. If an entry with the given ID does not exist, sunpy.database.EntryNotFoundError is raised.

get_tag(tag_name)

Get the tag which has the given name. If no such tag exists, sunpy.database.NoSuchTagError is raised.

query(*query, **kwargs)

Deprecated since version 0.8: The query function is deprecated and may be removed in a future version. Use database.search instead.

redo(n=1)

Redo the last n commands.

remove(database_entry)

Remove the given database entry from the database table.

remove_many(database_entries)

Remove a row of database entries “at once”. If this method is used, only one entry is saved in the undo history.

Parameters: database_entries (iterable of sunpy.database.tables.DatabaseEntry) – The database entries that will be removed from the database.

remove_tag(database_entry, tag_name)

Remove the given tag from the database entry. If the tag is not connected to any entry after this operation, the tag itself is removed from the database as well.

Raises: sunpy.database.NoSuchTagError – If the tag is not connected to the given entry.
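The clean-up rule for orphaned tags can be sketched with a dictionary that maps entry IDs to sets of tag names. Everything in this block is a stand-in written for illustration: the local NoSuchTagError class only mimics the role of sunpy.database.NoSuchTagError, and the functions are not sunpy's implementation.

```python
class NoSuchTagError(KeyError):
    """Stand-in for sunpy.database.NoSuchTagError, defined here for the sketch."""


def all_tags(entry_tags):
    """Tags that are still attached to at least one entry."""
    return set().union(*entry_tags.values())


def remove_tag(entry_tags, entry_id, tag_name):
    """Detach `tag_name` from one entry; fail if the entry does not carry it."""
    if tag_name not in entry_tags[entry_id]:
        raise NoSuchTagError(tag_name)
    entry_tags[entry_id].remove(tag_name)


entry_tags = {1: {'flare', 'aia'}, 2: {'aia'}}

remove_tag(entry_tags, 1, 'aia')      # entry 2 still carries 'aia'
tags_after_first = all_tags(entry_tags)

remove_tag(entry_tags, 1, 'flare')    # no entry carries 'flare' any more
tags_after_second = all_tags(entry_tags)

try:
    remove_tag(entry_tags, 2, 'flare')
    raised = False
except NoSuchTagError:
    raised = True
```

Because the set of known tags is derived from the entries, a tag disappears as soon as the last entry that carried it loses it.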
search(*query[, sortby])

Send the given query to the database and return a list of database entries that satisfy all of the given attributes.

Apart from the attributes supported by the VSO interface, the following attributes are supported:

An important difference to the VSO attributes is that these attributes may also be used in negated form using the tilde ~ operator.

Parameters:
- query (list) – A variable number of attributes that are chained together via the boolean AND operator. The | operator may be used between attributes to express the boolean OR operator.
- sortby (str, optional) – The column by which to sort the returned entries. The default is to sort by the start of the observation. See the attributes of sunpy.database.tables.DatabaseEntry for a list of all possible values.

Raises: TypeError – If no attribute is given or if some keyword argument other than ‘sortby’ is given.

Examples

The query in the following example searches for all non-starred entries with the tag ‘foo’ or ‘bar’ (or both).

>>> from sunpy.database import attrs
>>> database.search(~attrs.Starred(), attrs.Tag('foo') | attrs.Tag('bar'))
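To make the semantics of ~ and | concrete, here is a self-contained toy model of composable query attributes built on Python's operator overloading. The classes Attr, Not, Or, Starred, and Tag, the dictionary-based entries, and the search function are all written for this sketch; they imitate the behaviour described above and are not the sunpy classes of the same names.

```python
class Attr:
    """Base class: subclasses decide whether an entry matches."""

    def matches(self, entry):
        raise NotImplementedError

    def __invert__(self):          # ~attr
        return Not(self)

    def __or__(self, other):       # attr | other
        return Or(self, other)


class Not(Attr):
    def __init__(self, attr):
        self.attr = attr

    def matches(self, entry):
        return not self.attr.matches(entry)


class Or(Attr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def matches(self, entry):
        return self.left.matches(entry) or self.right.matches(entry)


class Starred(Attr):
    def matches(self, entry):
        return entry['starred']


class Tag(Attr):
    def __init__(self, name):
        self.name = name

    def matches(self, entry):
        return self.name in entry['tags']


def search(entries, *query):
    """Return entries that satisfy all given attributes (boolean AND)."""
    if not query:
        raise TypeError('at least one attribute is required')
    return [e for e in entries if all(attr.matches(e) for attr in query)]


entries = [
    {'id': 1, 'starred': False, 'tags': {'foo'}},
    {'id': 2, 'starred': True, 'tags': {'foo', 'bar'}},
    {'id': 3, 'starred': False, 'tags': {'baz'}},
    {'id': 4, 'starred': False, 'tags': {'bar'}},
]
result_ids = [e['id'] for e in search(entries, ~Starred(), Tag('foo') | Tag('bar'))]
```

Entry 2 is excluded because it is starred, and entry 3 because it carries neither tag.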

set_cache_size(cache_size)

Set a new value for the maximum number of database entries in the cache. Use the value float('inf') to disable caching. If the new cache is smaller than the previous one and cannot contain all the entries anymore, entries are removed from the cache until the number of entries equals the cache size. Which entries are removed depends on the implementation of the cache (e.g. sunpy.database.caching.LRUCache, sunpy.database.caching.LFUCache).
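The shrinking behaviour can be illustrated with a minimal least-recently-used cache built on collections.OrderedDict. TinyLRUCache and its set_maxsize method are hypothetical names for this sketch; sunpy's own cache classes may be structured differently.

```python
from collections import OrderedDict


class TinyLRUCache:
    """Toy LRU cache: the least recently used key is evicted first."""

    def __init__(self, maxsize=float('inf')):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)   # mark as most recently used
        self._evict()

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)   # reading also counts as use
        return value

    def set_maxsize(self, maxsize):
        """Change the size limit and drop entries that no longer fit."""
        self.maxsize = maxsize
        self._evict()

    def _evict(self):
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)   # remove the oldest entry

    def keys(self):
        return list(self._data)


cache = TinyLRUCache()            # no limit by default
for key in 'abcd':
    cache[key] = key.upper()
_ = cache['a']                    # 'a' becomes the most recently used key
cache.set_maxsize(2)              # evicts 'b' and 'c', the least recently used
remaining = cache.keys()
```

A least-frequently-used cache would differ only in the eviction rule: it would drop the keys with the lowest access counts instead of the oldest ones.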

show_in_browser(columns=None, sort=False, jsviewer=True)

star(database_entry, ignore_already_starred=False)

Mark the given database entry as starred. If this entry is already marked as starred, the behaviour depends on the optional argument ignore_already_starred: if it is False (the default), sunpy.database.EntryAlreadyStarredError is raised. Otherwise, the entry is kept as starred and no exception is raised.

tag(database_entry, *tags)

Assign the given database entry the given tags.

Raises:
- TypeError – If no tags are given.
- sunpy.database.TagAlreadyAssignedError – If at least one of the given tags is already assigned to the given database entry.

undo(n=1)

Undo the last n commands.

unstar(database_entry, ignore_already_unstarred=False)

Remove the starred mark of the given entry. If this entry is not marked as starred, the behaviour depends on the optional argument ignore_already_unstarred: if it is False (the default), sunpy.database.EntryAlreadyUnstarredError is raised. Otherwise, the entry is kept as unstarred and no exception is raised.