Database
- class sunpy.database.Database(url[, CacheClass[, cache_size[, default_waveunit]]])
Bases: object
- Parameters:
url (str) – A URL describing the database. This value is passed unchanged to sqlalchemy.create_engine(). If not specified, the value is read from the sunpy config file.
CacheClass (sunpy.database.caching.BaseCache) – A concrete cache implementation of the abstract class BaseCache. The built-in implementations are sunpy.database.caching.LRUCache and sunpy.database.caching.LFUCache. The default is sunpy.database.caching.LRUCache.
cache_size (int) – The maximum number of database entries held in the cache. The default is no limit.
default_waveunit (str or Quantity, optional) – The wavelength unit used when an entry is added to the database but its wavelength unit cannot be found (either in the file or in the VSO query result block, depending on how the entry was added). If a Unit is passed, it is assigned to default_waveunit. If a str is passed, it is converted to a Quantity through the Quantity initializer and then assigned to default_waveunit. If an invalid string is passed, WaveunitNotConvertibleError is raised. If None (the default), attempting to add an entry without a known wavelength unit raises sunpy.database.tables.WaveunitNotFoundError.
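The two built-in cache classes differ in which entry they discard once the cache is full. The following is a minimal standard-library sketch of that difference, not the code in sunpy.database.caching; the class names TinyLRU and TinyLFU and their get/put methods are hypothetical and exist only for illustration.

```python
from collections import Counter, OrderedDict


class TinyLRU:
    """Hypothetical cache: evicts the entry used least recently."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key):
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # drop the least recently used entry


class TinyLFU:
    """Hypothetical cache: evicts the entry accessed the fewest times."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}
        self.hits = Counter()

    def get(self, key):
        self.hits[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.maxsize:
            victim = min(self.data, key=lambda k: self.hits[k])
            del self.data[victim]
            del self.hits[victim]
        self.data[key] = value
        self.hits[key] += 1


lru, lfu = TinyLRU(2), TinyLFU(2)
for cache in (lru, lfu):
    cache.put(1, "a")
    cache.get(1)
    cache.get(1)      # entry 1 is now the most frequently used
    cache.put(2, "b")
    cache.get(2)      # entry 2 is now the most recently used
    cache.put(3, "c")  # cache is full, so one entry has to go
```

With the same access pattern, the LRU sketch drops entry 1 (used longest ago) while the LFU sketch drops entry 2 (used least often).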
Attributes Summary
url – The sqlalchemy url of the database instance
Methods Summary
add(database_entry[, ignore_already_added]) – Add the given database entry to the database table.
add_from_dir(path[, recursive, pattern, ...]) – Search the given directory for FITS files and use their FITS headers to add new entries to the database.
add_from_fido_search_result(search_result[, ...]) – Generate database entries from a Fido search result and add all the generated entries to this database.
add_from_file(file[, ignore_already_added]) – Generate as many database entries as there are FITS headers in the given file and add them to the database.
add_from_hek_query_result(query_result[, ...]) – Add database entries from a HEK query result.
add_from_vso_query_result(query_result[, ...]) – Generate database entries from a VSO query result and add all the generated entries to this database.
add_many(database_entries[, ...]) – Add a row of database entries "at once".
clear() – Remove all entries from the database.
clear_histories() – Clear all entries from the undo and redo history.
commit() – Flush pending changes and commit the current transaction.
display_entries([columns, sort])
download_from_hek_query_result(query_result) – Add new database entries from a HEK query result by converting it into a VSO query and downloading the corresponding data files.
download_from_vso_query_result(query_result) – Add new database entries from a VSO query result and download the corresponding data files.
edit(database_entry, **kwargs) – Change the given database entry so that it interprets the passed key-value pairs as new values where the keys represent the attributes of this entry.
fetch(*query, **kwargs) – Check if the query has already been used to collect new data.
get_entry_by_id(entry_id) – Get a database entry by its unique ID number.
get_tag(tag_name) – Get the tag which has the given name.
redo([n]) – Redo the last n commands.
remove(database_entry) – Remove the given database entry from the database table.
remove_many(database_entries) – Remove a row of database entries "at once".
remove_tag(database_entry, tag_name) – Remove the given tag from the database entry.
search(*query[, sortby]) – Send the given query to the database and return a list of database entries that satisfy all of the given attributes.
set_cache_size(cache_size) – Set a new value for the maximum number of database entries in the cache.
show_in_browser([columns, sort, jsviewer])
star(database_entry[, ignore_already_starred]) – Mark the given database entry as starred.
tag(database_entry, *tags) – Assign the given database entry the given tags.
undo([n]) – Undo the last n commands.
unstar(database_entry[, ...]) – Remove the starred mark of the given entry.
Attributes Documentation
- cache_maxsize
- cache_size
- tags
- url
The sqlalchemy url of the database instance
Methods Documentation
- add(database_entry, ignore_already_added=False)
Add the given database entry to the database table.
- Parameters:
database_entry (sunpy.database.tables.DatabaseEntry) – The database entry that will be added to this database.
ignore_already_added (bool, optional) – If False (the default), attempting to add an entry that already exists raises sunpy.database.EntryAlreadyAddedError. If True, the entry is added anyway, which leaves duplicates in the database.
- add_from_dir(path, recursive=False, pattern='*', ignore_already_added=False, time_string_parse_format=None)
Search the given directory for FITS files and use their FITS headers to add new entries to the database. Note that one entry in the database is assigned to a list of FITS headers, so the number of database entries added is determined by the number of FITS files read, not by the number of FITS headers. FITS files are detected by reading the content of each file; the pattern argument may be used to avoid reading every file in a directory when all FITS files are known to share the same filename extension.
- Parameters:
path (str) – The directory in which to look for FITS files.
recursive (bool, optional) – If True, the given directory is searched recursively. Otherwise, only the given directory itself is searched, without its subdirectories. The default is False.
pattern (str, optional) – A pattern used to filter the list of filenames before any file is read. The default is to collect all files. This value is passed to fnmatch.filter(); see its documentation for the supported syntax.
ignore_already_added (bool, optional) – See sunpy.database.Database.add().
time_string_parse_format (str, optional) – Fallback timestamp format that is passed to strptime if sunpy.time.parse_time is unable to read the date-obs metadata automatically.
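To make the interplay of recursive and pattern concrete, here is a small sketch of how filenames could be pre-filtered with fnmatch.filter() before any file is opened. It is an illustration built on the standard library only; the helper name candidate_files is hypothetical and this is not the code sunpy runs internally.

```python
import fnmatch
import os
import tempfile


def candidate_files(path, recursive=False, pattern="*"):
    """Return sorted paths under `path` whose file names match `pattern`."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(path):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
        if not recursive:
            break  # os.walk yields the top directory first; stop there
    return sorted(matches)


with tempfile.TemporaryDirectory() as root:
    # Build a throwaway directory tree with two FITS-named files.
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("a.fits", "notes.txt", os.path.join("sub", "b.fits")):
        open(os.path.join(root, rel), "w").close()

    flat = [os.path.relpath(p, root)
            for p in candidate_files(root, pattern="*.fits")]
    deep = [os.path.relpath(p, root)
            for p in candidate_files(root, recursive=True, pattern="*.fits")]
```

Here flat contains only the top-level a.fits, while deep also contains the file in the subdirectory; notes.txt is never considered in either case.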
- add_from_fido_search_result(search_result, ignore_already_added=False)
Generate database entries from a Fido search result and add all the generated entries to this database.
- Parameters:
search_result (sunpy.net.fido_factory.UnifiedResponse) – A UnifiedResponse object that is used to store responses from the unified downloader. This is returned by the search method of a sunpy.net.fido_factory.UnifiedDownloaderFactory object.
ignore_already_added (bool) – See sunpy.database.Database.add().
- add_from_file(file, ignore_already_added=False)
Generate as many database entries as there are FITS headers in the given file and add them to the database.
- Parameters:
file (str or file object) – Either a path pointing to a FITS file or an opened file-like object. An opened file object must have one of the following modes: rb, rb+, or ab+.
ignore_already_added (bool, optional) – See sunpy.database.Database.add().
- add_from_hek_query_result(query_result, ignore_already_added=False)
Add database entries from a HEK query result.
- Parameters:
query_result (list) – The value returned by sunpy.net.hek.HEKClient.search().
ignore_already_added (bool) – See sunpy.database.Database.add().
- add_from_vso_query_result(query_result, ignore_already_added=False)
Generate database entries from a VSO query result and add all the generated entries to this database.
- Parameters:
query_result (sunpy.net.vso.VSOQueryResponseTable) – A VSO query response that was returned by the query method of a sunpy.net.vso.VSOClient object.
ignore_already_added (bool) – See sunpy.database.Database.add().
- add_many(database_entries, ignore_already_added=False)
Add a row of database entries “at once”. If this method is used, only one entry is saved in the undo history.
- Parameters:
database_entries (list) – The list of DatabaseEntry objects that will be added to the database.
ignore_already_added (bool, optional) – See sunpy.database.Database.add().
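The statement that a bulk operation saves only one entry in the undo history can be pictured with a composite command. The sketch below is an assumption about the general pattern, written with plain Python lists; the classes AddEntry, Composite and History are hypothetical and do not mirror sunpy's internal command classes.

```python
class AddEntry:
    """Hypothetical command: appends one entry and can take it back."""

    def __init__(self, table, entry):
        self.table, self.entry = table, entry

    def do(self):
        self.table.append(self.entry)

    def undo(self):
        self.table.remove(self.entry)


class Composite:
    """Bundle of commands that is executed and undone as a single step."""

    def __init__(self, commands):
        self.commands = list(commands)

    def do(self):
        for command in self.commands:
            command.do()

    def undo(self):
        for command in reversed(self.commands):
            command.undo()


class History:
    """Undo and redo stacks; each executed command is one history step."""

    def __init__(self):
        self.undo_stack, self.redo_stack = [], []

    def run(self, command):
        command.do()
        self.undo_stack.append(command)
        self.redo_stack.clear()

    def undo(self, n=1):
        for _ in range(n):
            command = self.undo_stack.pop()
            command.undo()
            self.redo_stack.append(command)

    def redo(self, n=1):
        for _ in range(n):
            command = self.redo_stack.pop()
            command.do()
            self.undo_stack.append(command)


table, history = [], History()
history.run(AddEntry(table, "e1"))                       # single add
history.run(Composite(AddEntry(table, e) for e in ("e2", "e3", "e4")))  # bulk add
steps_recorded = len(history.undo_stack)

history.undo()            # one undo removes all three bulk-added entries
after_undo = list(table)
history.redo()
after_redo = list(table)
```

Four entries were added but only two history steps were recorded, so a single undo reverts the whole bulk addition.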
- clear()
Remove all entries from the database. This operation can be undone using the undo() method.
- commit()
Flush pending changes and commit the current transaction. This is a shortcut for the commit() method of the underlying SQLAlchemy session.
- download_from_hek_query_result(query_result, client=None, path=None, progress=False, ignore_already_added=False, overwrite=False)
Add new database entries from a HEK query result by converting it into a VSO query and downloading the corresponding data files.
- Parameters:
query_result (HEKTable or HEKRow) – The value returned by sunpy.net.hek.HEKClient.search().
client (sunpy.net.vso.VSOClient, optional) – VSO Client instance to use for search and download. If not specified, a new instance is created.
path (str) – Path to which the files are downloaded.
progress (bool) – If True, a progress bar is displayed during file download.
ignore_already_added (bool) – See sunpy.database.Database.add().
overwrite (bool, optional) – If True, matching database entries from the query results are deleted and replaced with new database entries, and all files are downloaded again. Otherwise, no files are downloaded and matching database entries are not updated.
- download_from_vso_query_result(query_result, client=None, path=None, progress=False, ignore_already_added=False, overwrite=False)
Add new database entries from a VSO query result and download the corresponding data files. See sunpy.database.Database.fetch() for information about the caching mechanism used and about the parameters client, path, and progress.
- Parameters:
query_result (sunpy.net.vso.VSOQueryResponseTable) – A VSO query response that was returned by the query method of a sunpy.net.vso.VSOClient object.
ignore_already_added (bool) – See sunpy.database.Database.add().
- edit(database_entry, **kwargs)
Change the given database entry so that it interprets the passed key-value pairs as new values where the keys represent the attributes of this entry. If no keyword arguments are given, ValueError is raised.
- fetch(*query, **kwargs)
Check if the query has already been used to collect new data.
If yes, query the database using the method sunpy.database.Database.search() and return the result.
Otherwise, the retrieved search result is used to download all files that belong to it. After that, all the gathered information (from both the query result and the downloaded files) is added to the database such that each header is represented by one database entry.
It uses the sunpy.database.Database._download_and_collect_entries() method to download files, which uses query result block level caching. This means that files will not be downloaded for any query result block that had its files downloaded previously. If files for Query A were already downloaded, and then Query B is made which has some result blocks in common with Query A, then files for these common blocks will not be downloaded again. Files will only be downloaded for those blocks which are new or haven’t had their files downloaded yet.
If querying results in no data, no operation is performed. Concretely, this means that no entry is added to the database and no file is downloaded.
- Parameters:
*query (list) – A variable number of attributes that are chained together via the boolean AND operator. The | operator may be used between attributes to express the boolean OR operator.
path (str, optional) – The directory into which files will be downloaded.
overwrite (bool, optional) – If True, matching database entries from the query results are deleted and replaced with new database entries, and all files are downloaded again. Otherwise, no files are downloaded and matching database entries are not updated.
client (sunpy.net.vso.VSOClient, optional) – VSO Client instance to use for search and download. If not specified, a new instance is created.
progress (bool, optional) – If True, a progress bar is displayed during file download.
methods (str or iterable of str, optional) – Sets the VSOClient download method; see sunpy.net.vso.VSOClient.fetch for details.
Examples
This method can be used along with the overwrite=True argument to overwrite and redownload files corresponding to the query, even if its entries are already present in the database. Note that the overwrite=True argument deletes the old matching database entries, and new database entries are added with information from the redownloaded files.
>>> from sunpy.database import Database
>>> from sunpy.database.tables import display_entries
>>> from sunpy.net import vso, attrs as a
>>> database = Database('sqlite:///:memory:')
>>> database.fetch(a.Time('2012-08-05', '2012-08-05 00:00:05'),
...                a.Instrument.aia)
>>> print(display_entries(database,
...                       ['id', 'observation_time_start', 'observation_time_end',
...                        'instrument', 'wavemin', 'wavemax']))
 id observation_time_start observation_time_end instrument wavemin wavemax
--- ---------------------- -------------------- ---------- ------- -------
  1    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
  2    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
  3    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
  4    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
>>> database.fetch(a.Time('2012-08-05', '2012-08-05 00:00:01'),
...                a.Instrument.aia, overwrite=True)
>>> print(display_entries(database,
...                       ['id', 'observation_time_start', 'observation_time_end',
...                        'instrument', 'wavemin', 'wavemax']))
 id observation_time_start observation_time_end instrument wavemin wavemax
--- ---------------------- -------------------- ---------- ------- -------
  3    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
  4    2012-08-05 00:00:02  2012-08-05 00:00:03        AIA    33.5    33.5
  5    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
  6    2012-08-05 00:00:01  2012-08-05 00:00:02        AIA     9.4     9.4
Here the first 2 entries (IDs 1 and 2) were overwritten and their files were redownloaded, resulting in the entries with IDs 5 and 6.
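The query result block level caching described for fetch() (Query A followed by an overlapping Query B) can be pictured with a set of already-downloaded block identifiers. This is only a sketch of the idea under stated assumptions: the function fetch_blocks, the block ids b1 to b3 and the example.invalid URLs are all hypothetical placeholders, and no network access takes place.

```python
def fetch_blocks(blocks, downloaded, download):
    """Download only the blocks whose ids are not yet in `downloaded`.

    `blocks` maps a block id to its file URL and `download` is any callable
    that retrieves one URL. Both are stand-ins used for illustration.
    """
    fetched = []
    for block_id, url in blocks.items():
        if block_id in downloaded:
            continue  # files for this block were downloaded before
        download(url)
        downloaded.add(block_id)
        fetched.append(block_id)
    return fetched


downloaded_ids = set()
log = []  # records every simulated download

query_a = {"b1": "https://example.invalid/1", "b2": "https://example.invalid/2"}
query_b = {"b2": "https://example.invalid/2", "b3": "https://example.invalid/3"}

first = fetch_blocks(query_a, downloaded_ids, log.append)
second = fetch_blocks(query_b, downloaded_ids, log.append)
```

The second call downloads only block b3, because b2 is shared with the first query and is skipped.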
- get_entry_by_id(entry_id)
Get a database entry by its unique ID number. If an entry with the given ID does not exist, sunpy.database.EntryNotFoundError is raised.
- get_tag(tag_name)
Get the tag which has the given name. If no such tag exists, sunpy.database.NoSuchTagError is raised.
- remove_many(database_entries)
Remove a row of database entries “at once”. If this method is used, only one entry is saved in the undo history.
- Parameters:
database_entries (list) – The list of DatabaseEntry objects that will be removed from the database.
- remove_tag(database_entry, tag_name)
Remove the given tag from the database entry. If the tag is not connected to any entry after this operation, the tag itself is removed from the database as well.
- Raises:
sunpy.database.NoSuchTagError – If the tag is not connected to the given entry.
- search(*query[, sortby])
Send the given query to the database and return a list of database entries that satisfy all of the given attributes.
Apart from the attributes supported by the VSO interface, the following attributes are supported:
An important difference to the VSO attributes is that these attributes may also be used in negated form using the tilde ~ operator.
- Parameters:
*query (list) – A variable number of attributes that are chained together via the boolean AND operator. The | operator may be used between attributes to express the boolean OR operator.
sortby (str, optional) – The column by which to sort the returned entries. The default is to sort by the start of the observation. See the attributes of sunpy.database.tables.DatabaseEntry for a list of all possible values.
- Returns:
table (list) – List of sunpy.database.tables.DatabaseEntry objects that satisfy all of the given attributes.
- Raises:
TypeError – If no attribute is given or if some keyword argument other than ‘sortby’ is given.
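How positional attributes are ANDed together, combined with |, and negated with ~ can be sketched with ordinary operator overloading. The classes below are hypothetical stand-ins that filter a list of dictionaries; they are not sunpy's attribute classes, and the local search function only imitates the calling convention described above.

```python
class Attr:
    """Hypothetical base class: attributes combine with &, | and ~."""

    def __and__(self, other):
        return And(self, other)

    def __or__(self, other):
        return Or(self, other)

    def __invert__(self):
        return Not(self)


class Starred(Attr):
    def matches(self, entry):
        return entry["starred"]


class Tag(Attr):
    def __init__(self, name):
        self.name = name

    def matches(self, entry):
        return self.name in entry["tags"]


class And(Attr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def matches(self, entry):
        return self.left.matches(entry) and self.right.matches(entry)


class Or(Attr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def matches(self, entry):
        return self.left.matches(entry) or self.right.matches(entry)


class Not(Attr):
    def __init__(self, inner):
        self.inner = inner

    def matches(self, entry):
        return not self.inner.matches(entry)


def search(entries, *query):
    """Return entries that satisfy every positional attribute (boolean AND)."""
    if not query:
        raise TypeError("at least one attribute is required")
    return [e for e in entries if all(attr.matches(e) for attr in query)]


entries = [
    {"id": 1, "starred": True, "tags": {"foo"}},
    {"id": 2, "starred": False, "tags": {"bar"}},
    {"id": 3, "starred": False, "tags": {"baz"}},
    {"id": 4, "starred": False, "tags": {"foo", "bar"}},
]
hits = [e["id"] for e in search(entries, ~Starred(), Tag("foo") | Tag("bar"))]
```

Entry 1 is excluded because it is starred and entry 3 because it carries neither tag, leaving entries 2 and 4.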
Examples
The query in the following example searches for all non-starred entries with the tag ‘foo’ or ‘bar’ (or both).
>>> database.search(~attrs.Starred(), attrs.Tag('foo') | attrs.Tag('bar'))
- set_cache_size(cache_size)
Set a new value for the maximum number of database entries in the cache. Use the value float('inf') to disable caching. If the new cache is smaller than the previous one and cannot contain all the entries anymore, entries are removed from the cache until the number of entries equals the cache size. Which entries are removed depends on the implementation of the cache (e.g. sunpy.database.caching.LRUCache, sunpy.database.caching.LFUCache).
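The shrinking behaviour can be illustrated with an ordered mapping from which the oldest entries are evicted until the new limit is met. This is a sketch under the assumption of an oldest-first eviction order; the helper name shrink is hypothetical, and the real eviction order depends on the cache class in use.

```python
from collections import OrderedDict


def shrink(cache, new_size):
    """Evict the oldest entries until `cache` holds at most `new_size` items."""
    evicted = []
    while len(cache) > new_size:
        key, _ = cache.popitem(last=False)  # oldest insertion goes first
        evicted.append(key)
    return evicted


cache = OrderedDict((i, f"entry-{i}") for i in range(1, 6))

# In this sketch an infinite limit never triggers eviction,
# because no finite length exceeds float('inf').
nothing = shrink(cache, float("inf"))

# A smaller limit removes entries until exactly `new_size` remain.
dropped = shrink(cache, 2)
```

After the second call, the three oldest entries are gone and the cache holds exactly two entries.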
- star(database_entry, ignore_already_starred=False)
Mark the given database entry as starred. If this entry is already marked as starred, the behaviour depends on the optional argument ignore_already_starred: if it is False (the default), sunpy.database.EntryAlreadyStarredError is raised. Otherwise, the entry is kept as starred and no exception is raised.
- tag(database_entry, *tags)
Assign the given database entry the given tags.
- Raises:
TypeError – If no tags are given.
sunpy.database.TagAlreadyAssignedError – If at least one of the given tags is already assigned to the given database entry.
- unstar(database_entry, ignore_already_unstarred=False)
Remove the starred mark of the given entry. If this entry is not marked as starred, the behaviour depends on the optional argument ignore_already_unstarred: if it is False (the default), sunpy.database.EntryAlreadyUnstarredError is raised. Otherwise, the entry is kept as unstarred and no exception is raised.