Using the database package¶
The database package offers the possibility to save retrieved data (e.g.
via the sunpy.net
package) onto a local or remote database. The
database may be a single file located on a local hard drive (if a SQLite
database is used) or a local or remote database server (see the SQLAlchemy
documentation for a list of supported databases)
This makes it possible to fetch required data from the local database
instead of downloading it again from a remote server.
The package sunpy.database
was developed as part of Google Summer of Code (GSOC) 2013.
1. Connecting and initializing the database¶
To start a connection to an existing or a new database, create
a Database
object:
>>> from sunpy.database import Database
>>> database = Database('sqlite:///sunpydata.sqlite')
The database object in our example above connects to a new SQLite database with the file name “sunpydata.sqlite” in the current directory.
The first parameter of Database
receives one mandatory argument:
a URL which describes how to connect to the database. The supported
format of this URL is described by the documentation of
sqlalchemy.create_engine()
.
Note that a connection is only established when it’s really needed,
i.e. if some query is sent to the database to read from it. Transactions
can also be committed explicitly using the Database.commit()
method.
Warning
If you are using Database
objects in an interactive Python
session you must not forget to call the Database.commit()
method
on them explicitly before quitting the Python session! Otherwise, all
changes on the altered databases are lost!
Note
You can set the default database url in the sunpy config file under the ‘database’ section. See Customizing sunpy for information on the config file. A database section might look like this:
[database]
url = sqlite:////home/user/sunpy/my_database.sqlite
2. Adding new entries¶
Each entry in a database is an instance of the class
tables.DatabaseEntry
with the following attributes:
Attribute |
Description |
---|---|
id |
A unique ID number. By default it is None, but automatically set to the maximum number plus one when an entry is added to the database. |
source |
The source is the name of an observatory or the name of a network of observatories. |
provider |
The name of the server which provides the retrieved data. |
physobs |
A physical observable identifier used by VSO. |
fileid |
The file ID is a string defined by the data provider that should point to a specific data product. The association of fileid to the specific data may change sometime, if the fileid always points to the latest calibrated data. |
observation_time_start |
The date and time when the observation of the data started. |
observation_time_end |
The date and time when the observation of the data ended. |
instrument |
The instrument which was used to observe the data. |
size |
The size of the data in kilobytes (-1 if unknown). |
wavemin |
The value of the measured wave length. |
wavemax |
This is the same value as |
path |
A local file path where the according FITS file is saved. |
download_time |
The date and time when the files connected to a query have been downloaded. Note: this is not the date and time when this entry has been added to a database! |
starred |
Entries can be starred to mark them. By default, this value is False. |
fits_header_entries |
A list of |
fits_key_comments |
A list of |
tags |
A list of |
The
id
attribute is automatically set if an entry is added to a database.The attributes
source
,provider
,physobs
,fileid
,observation_time_start
,observation_time_end
,instrument
,size
,wavemin
, andwavemax
are set by methods which use the VSO interface. In particular, these areDatabase.add_from_vso_query_result()
, and possiblyDatabase.fetch()
.The attributes
path
anddownload_time
are set by the methodDatabase.download
and also possibly byDatabase.fetch()
.starred
is set or changed viaDatabase.star()
orDatabase.unstar()
.tags
is set viaDatabase.tag()
orDatabase.remove_tag()
.The attribute
fits_header_entries
is set by the methodsDatabase.download
,Database.add_from_dir()
, andDatabase.add_from_file()
.
2.1 Adding entries from one FITS file¶
The method Database.add_from_file()
receives one positional argument
(either a string or a file-like object) which is used to add at least one
new entry from the given FITS file to the database. Why “at least one” and
not “exactly one”? The reason is that each database entry does not
represent one file, but one FITS header from a file. That means if you pass
a file which has 5 FITS headers in it, 5 entries will be added to the
database. The file in the following example
(sunpy.data.sample.AIA_171_IMAGE
) has only one FITS header, so just one
entry is added to the database.
Note
If you are working with Hinode/SOT files you may notice that for each file you get two entries, one which refers to the observation and another that contains some (useless) telemetry data.
Database.add_from_file()
saves the value of path
by either simply
passing on the value of the received argument (if it was a string)
or by reading the value of file.name
where file
is the passed argument.
If the path cannot be determined, it stays as None
(the default value).
wavemin
and wavemax
are only set if the wavelength unit
of the FITS file can be found out or if the default_waveunit
attribute of
the database object is set. These values are then
used to convert from the used units to nanometers. The rationale behind
this behaviour is that it makes querying for wavelengths more flexible.
instrument
is set by looking up the
FITS header key INSTRUME. observation_time_start
is set
by searching for the FITS header key DATE-OBS or DATE_OBS.
Analogously, observation_time_end
is set by searching for DATE-END or
DATE_END. Finally, the whole FITS header is stored in the attribute
fits_header_entries
as a list of tables.FitsHeaderEntry
instances, and FITS comments are stored in the attribute
fits_key_comments
which is a list of tables.FitsKeyComment
instances.
Using the function len
on a Database
object returns the
number of saved database entries. To get the first entry of the database,
database[0]
is used (the ID number of the entries does not matter,
database[0]
always returns the oldest saved entry of the database). If
the database is empty, this expression raises an IndexError
.
In section 3, more advanced formats of the slicing syntax are introduced.
>>> import sunpy.data.sample
>>> database.add_from_file(sunpy.data.sample.AIA_171_IMAGE)
>>> len(database)
1
>>> entry = database[0]
>>> entry.path == sunpy.data.sample.AIA_171_IMAGE
True
>>> entry.wavemin, entry.wavemax
(17.1, 17.1)
>>> entry.instrument
'AIA_3'
>>> entry.observation_time_start, entry.observation_time_end
(datetime.datetime(2011, 6, 7, 6, 33, 2, 770000), None)
>>> len(entry.fits_header_entries)
194
>>> for fits_header_entry in entry.fits_header_entries[:10]:
... print('{entry.key}\n {entry.value}'.format(entry=fits_header_entry))
SIMPLE
True
BITPIX
-32
NAXIS
2
NAXIS1
1024
NAXIS2
1024
PCOUNT
0
GCOUNT
1
XTENSION
BINTABLE
BLD_VERS
V5R12X
LVL_NUM
1.5
>>> for fits_key_comment in entry.fits_key_comments:
... print('{comment.key}\n {comment.value}'.format(comment=fits_key_comment))
SIMPLE
conforms to FITS standard
BITPIX
array data type
NAXIS
number of array dimensions
NAXIS1
width of table in bytes
NAXIS2
number of rows in table
PCOUNT
number of group parameters
GCOUNT
number of groups
XTENSION
binary table extension
2.2 Adding entries from a directory of FITS files¶
Adding all FITS files from a directory works by calling the method
Database.add_from_dir()
and passing the desired directory to it. By
setting the keyword argument ignore_already_added
to True
, no
exception is raised if it is attempted to add an already existing entry
In the following example case, setting this parameter is required because the
file sunpy.data.sample.AIA_171_IMAGE
was already added, which is located
in the directory sampledata_dir
>>> from sunpy import config
>>> sampledata_dir = config.get("downloads", "sample_dir")
>>> database.default_waveunit = 'angstrom'
>>> database.add_from_dir(sampledata_dir, ignore_already_added=True,
... time_string_parse_format="%d/%m/%Y")
>>> len(database)
59
2.3 Adding entries using the VSO interface¶
2.3.1 Adding entries from a VSO query result¶
The number of database entries that will be added from a VSO query result
is equal to the value of len(qr)
in the following code snippet. Note that
Database.add_from_vso_query_result()
does not download any files,
though. If you want to add new entries using the VSO and also want to
download files at the same time, take a look at the following two
sections.
>>> from sunpy.net import vso, attrs as a
>>> client = vso.VSOClient()
After initialising the VSO client:
>>> qr = client.search(
... a.Time('2011-05-08', '2011-05-08 00:00:05'),
... a.Instrument.aia)
>>> len(qr)
4
>>> database.add_from_vso_query_result(qr)
>>> len(database)
63
2.3.2 “Clever” Fetching¶
The method Database.fetch()
checks if the given query has already been
used once to add entries to the database. Otherwise, the query is used to
download and add new data. The Database.fetch()
method also accepts an
optional keyword argument path
which is passed as-is to
sunpy.net.vso.VSOClient.fetch()
and determines the value of the path
attribute of each entry.
Note that the number of entries that will be added depends on the total number of FITS headers, not the number of records in the query.
In the next code snippet new data is downloaded as this query has not been downloaded before.
>>> entries = database.fetch(
... a.Time('2012-08-05', '2012-08-05 00:00:05'),
... a.Instrument.aia)
>>> len(database)
67
Any more identical fetch calls don’t add any new entries to the database because they have already been downloaded.
>>> entries = database.fetch(
... a.Time('2012-08-05', '2012-08-05 00:00:05'),
... a.Instrument.aia)
>>> len(database)
67
However, this different fetch call downloads new files because there is a new date range whose files have not been downloaded yet.
>>> entries = database.fetch(
... a.Time('2013-08-05', '2013-08-05 00:00:05'),
... a.Instrument.aia)
>>> len(database)
71
The caching also ensures that when queries have some results in common, files for the common results will not be downloaded again. In the following example, the query is new, but all of it’s files have already been downloaded. This means no new files are downloaded.
>>> entries = database.fetch(
... a.Time('2012-08-05 00:00:00', '2012-08-05 00:00:01'),
... a.Instrument.aia)
>>> len(database)
71
2.4 Adding entries manually¶
Although usually not required, it is also possible to add database entries
by specifying the parameters manually. To do so pass the
values as keyword arguments to tables.DatabaseEntry
as follows:
>>> from sunpy.database.tables import DatabaseEntry
>>> entry = DatabaseEntry(instrument='EIT', wavemin=25.0)
>>> database.add(entry)
>>> entry in database
False
>>> database.commit()
>>> entry in database
True
>>> len(database)
72
Note that the in
operator only works as expected after the
Database.commit()
method has been called!
3. Displaying entries in a table¶
In the previous code snippets 71 entries have been added,
all of them saving a lot of data. To display the database in a table format
there is a helper function. tables.display_entries()
takes two
arguments: the first one is an iterator of tables.DatabaseEntry
instances. Remember that an instance of Database
yields instances
of tables.DatabaseEntry
instances, so you can simply pass a
database object. The second argument is an iterable of the resulting
columns in the table to be displayed. Each string in this iterable is used
to access the entry’s attribute values. In the following example, the
values of entry.id
, entry.observation_time_start
,
entry.observation_time_end
, entry.instrument
, entry.wavemin
,
and entry.wavemax
are displayed (where entry
stands for the
respective database entry). Note that N/A is displayed if the value
cannot be found or is not set.
>>> from sunpy.database.tables import display_entries
>>> print(display_entries(database,
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax']))
id observation_time_start ... wavemin wavemax
--- ---------------------- ... ------------------ ------------------
1 2011-06-07 06:33:02 ... 17.1 17.1
2 2011-06-07 06:33:01 ... 13.100000000000001 13.100000000000001
3 2011-06-07 06:33:02 ... 17.1 17.1
4 2011-06-07 06:33:02 ... 21.1 21.1
5 2011-06-07 06:33:03 ... 33.5 33.5
6 2011-06-07 06:33:05 ... 9.4 9.4
7 2011-06-07 06:33:05 ... 160.0 160.0
8 2011-06-07 06:33:07 ... 19.3 19.3
9 2011-06-07 06:33:07 ... 19.3 19.3
10 2011-06-07 06:39:31 ... 19.3 19.3
... ... ... ... ...
62 2011-05-08 00:00:02 ... 9.4 9.4
63 2011-05-08 00:00:03 ... 33.5 33.5
64 2012-08-05 00:00:01 ... 9.4 9.4
65 2012-08-05 00:00:01 ... 9.4 9.4
66 2012-08-05 00:00:02 ... 33.5 33.5
67 2012-08-05 00:00:02 ... 33.5 33.5
68 2013-08-05 00:00:01 ... 9.4 9.4
69 2013-08-05 00:00:01 ... 9.4 9.4
70 2013-08-05 00:00:02 ... 33.5 33.5
71 2013-08-05 00:00:02 ... 33.5 33.5
72 N/A ... 25.0 N/A
Length = 72 rows
The index operator can be used to access certain single database entries like
database[5]
to get the 5th entry or database[-2]
to get the second-latest entry. It is also possible to
use more advanced slicing syntax;
see https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
for more information.
>>> print(display_entries(database[9::10],
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax']))
id observation_time_start observation_time_end instrument wavemin wavemax
--- ---------------------- -------------------- ---------- ------- -------
10 2011-06-07 06:39:31 N/A AIA_2 19.3 19.3
20 2011-06-06 23:59:55 2011-06-08 00:00:05 GBM N/A N/A
30 N/A N/A N/A N/A N/A
40 N/A N/A N/A N/A N/A
50 N/A N/A N/A N/A N/A
60 2011-05-08 00:00:00 2011-05-08 00:00:01 AIA 17.1 17.1
70 2013-08-05 00:00:02 2013-08-05 00:00:03 AIA 33.5 33.5
4. Removing entries¶
Database.remove()
can be used to remove database entries from the sunpy
database. It takes a tables.DatabaseEntry
object as argument.
For example, imagine we want to only have database entries which have an
observation time saved. To remove all the entries where the value of both
observation_time_start
and observation_time_end
is None
,
simply iterate over the database and the Database.remove()
method to remove those where there is no time set:
>>> for database_entry in database:
... if database_entry.observation_time_start is None and database_entry.observation_time_end is None:
... database.remove(database_entry)
...
>>> len(database)
39
>>> print(display_entries(database,
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax']))
id observation_time_start ... wavemin wavemax
--- ---------------------- ... ------------------ ------------------
1 2011-06-07 06:33:02 ... 17.1 17.1
2 2011-06-07 06:33:01 ... 13.100000000000001 13.100000000000001
3 2011-06-07 06:33:02 ... 17.1 17.1
4 2011-06-07 06:33:02 ... 21.1 21.1
5 2011-06-07 06:33:03 ... 33.5 33.5
6 2011-06-07 06:33:05 ... 9.4 9.4
7 2011-06-07 06:33:05 ... 160.0 160.0
8 2011-06-07 06:33:07 ... 19.3 19.3
9 2011-06-07 06:33:07 ... 19.3 19.3
10 2011-06-07 06:39:31 ... 19.3 19.3
... ... ... ... ...
61 2011-05-08 00:00:00 ... 21.1 21.1
62 2011-05-08 00:00:02 ... 9.4 9.4
63 2011-05-08 00:00:03 ... 33.5 33.5
64 2012-08-05 00:00:01 ... 9.4 9.4
65 2012-08-05 00:00:01 ... 9.4 9.4
66 2012-08-05 00:00:02 ... 33.5 33.5
67 2012-08-05 00:00:02 ... 33.5 33.5
68 2013-08-05 00:00:01 ... 9.4 9.4
69 2013-08-05 00:00:01 ... 9.4 9.4
70 2013-08-05 00:00:02 ... 33.5 33.5
71 2013-08-05 00:00:02 ... 33.5 33.5
Length = 39 rows
5. Editing entries¶
5.1 Starring and unstarring entries¶
The database package supports marking certain entries as “starred” using the
Database.star()
method. For example, to star
all values that have a wavelength of 20nm or higher:
>>> for database_entry in database:
... if database_entry.wavemin and database_entry.wavemin > 20:
... database.star(database_entry)
>>> print(display_entries(
... filter(lambda entry: entry.starred, database),
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax']))
id observation_time_start ... wavemin wavemax
--- ---------------------- ... ----------------- -----------------
4 2011-06-07 06:33:02 ... 21.1 21.1
5 2011-06-07 06:33:03 ... 33.5 33.5
7 2011-06-07 06:33:05 ... 160.0 160.0
16 2011-06-07 06:32:11 ... 617.3000000000001 617.3000000000001
61 2011-05-08 00:00:00 ... 21.1 21.1
63 2011-05-08 00:00:03 ... 33.5 33.5
66 2012-08-05 00:00:02 ... 33.5 33.5
67 2012-08-05 00:00:02 ... 33.5 33.5
70 2013-08-05 00:00:02 ... 33.5 33.5
71 2013-08-05 00:00:02 ... 33.5 33.5
To remove the star from these entries, the Database.unstar()
method
works the same way.
5.3 Manually changing attributes¶
Attributes for entries in the database can be manually edited. The
Database.edit()
method receives the database entry to be edited and any
number of keyword arguments to describe which values to change and how. For
example, to change the time entries that are None
to the start of the
observation time (because it’s possible, not because it is accurate!):
>>> for database_entry in database:
... if database_entry.observation_time_end is None:
... database.edit(database_entry, observation_time_end=database_entry.observation_time_start)
...
>>> print(display_entries(
... database,
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax']))
id observation_time_start ... wavemin wavemax
--- ---------------------- ... ------------------ ------------------
1 2011-06-07 06:33:02 ... 17.1 17.1
2 2011-06-07 06:33:01 ... 13.100000000000001 13.100000000000001
3 2011-06-07 06:33:02 ... 17.1 17.1
4 2011-06-07 06:33:02 ... 21.1 21.1
5 2011-06-07 06:33:03 ... 33.5 33.5
6 2011-06-07 06:33:05 ... 9.4 9.4
7 2011-06-07 06:33:05 ... 160.0 160.0
8 2011-06-07 06:33:07 ... 19.3 19.3
9 2011-06-07 06:33:07 ... 19.3 19.3
10 2011-06-07 06:39:31 ... 19.3 19.3
... ... ... ... ...
61 2011-05-08 00:00:00 ... 21.1 21.1
62 2011-05-08 00:00:02 ... 9.4 9.4
63 2011-05-08 00:00:03 ... 33.5 33.5
64 2012-08-05 00:00:01 ... 9.4 9.4
65 2012-08-05 00:00:01 ... 9.4 9.4
66 2012-08-05 00:00:02 ... 33.5 33.5
67 2012-08-05 00:00:02 ... 33.5 33.5
68 2013-08-05 00:00:01 ... 9.4 9.4
69 2013-08-05 00:00:01 ... 9.4 9.4
70 2013-08-05 00:00:02 ... 33.5 33.5
71 2013-08-05 00:00:02 ... 33.5 33.5
Length = 39 rows
It is possible to use
database_entry.observation_time_end = database_entry.observation_time_start
”,
but it has one major disadvantage: this operation cannot be undone.
See section 6 to see how undoing and redoing works.
6. Undoing and redoing operations¶
A very handy feature of the database package is that every operation that
changes the database in some way can be reverted. The Database class has
the methods Database.undo()
and Database.redo()
to undo
and redo the last n commands, respectively.
Warning
The undo and redo history are only saved in-memory! That means in particular, that if you work on a Database object in an interactive Python session and quit this session, the undo and redo history are lost.
In the following snippet, the operations from the sections 5.3, 5.2, and 5.1 are undone. Note the following changes: there are no longer any tags anymore saved in the database, there is no entry which is starred and the entries with no end of observation time are back.
>>> database.undo(4) # undo the edits from 5.3 (4 records have been affected)
>>> print(display_entries(
... database,
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax', 'tags', 'starred']))
id observation_time_start observation_time_end ... tags starred
--- ---------------------- -------------------- ... ------ -------
1 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A No
2 2011-06-07 06:33:01 2011-06-07 06:33:01 ... N/A No
3 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A No
4 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A Yes
5 2011-06-07 06:33:03 2011-06-07 06:33:03 ... N/A Yes
6 2011-06-07 06:33:05 2011-06-07 06:33:05 ... N/A No
7 2011-06-07 06:33:05 2011-06-07 06:33:05 ... N/A Yes
8 2011-06-07 06:33:07 2011-06-07 06:33:07 ... N/A No
9 2011-06-07 06:33:07 2011-06-07 06:33:07 ... N/A No
10 2011-06-07 06:39:31 2011-06-07 06:39:31 ... N/A No
... ... ... ... ... ...
61 2011-05-08 00:00:00 2011-05-08 00:00:01 ... spring Yes
62 2011-05-08 00:00:02 2011-05-08 00:00:03 ... spring No
63 2011-05-08 00:00:03 2011-05-08 00:00:04 ... spring Yes
64 2012-08-05 00:00:01 2012-08-05 00:00:02 ... N/A No
65 2012-08-05 00:00:01 2012-08-05 00:00:02 ... N/A No
66 2012-08-05 00:00:02 2012-08-05 00:00:03 ... N/A Yes
67 2012-08-05 00:00:02 2012-08-05 00:00:03 ... N/A Yes
68 2013-08-05 00:00:01 2013-08-05 00:00:02 ... N/A No
69 2013-08-05 00:00:01 2013-08-05 00:00:02 ... N/A No
70 2013-08-05 00:00:02 2013-08-05 00:00:03 ... N/A Yes
71 2013-08-05 00:00:02 2013-08-05 00:00:03 ... N/A Yes
Length = 39 rows
The redo()
method reverts the last n operations that have been
undone. If not that many operations can be redone (i.e. any number greater than
14 in this example), an exception is raised. The redo call
reverts the original state: the tags have appeared again and all entries with a
wavelength >20nm are starred again, and there are no entries with no
stored end of observation time.
>>> database.redo(1) # redo all undone operations
>>> print(display_entries(
... database,
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax', 'tags', 'starred']))
id observation_time_start observation_time_end ... tags starred
--- ---------------------- -------------------- ... ------ -------
1 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A No
2 2011-06-07 06:33:01 2011-06-07 06:33:01 ... N/A No
3 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A No
4 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A Yes
5 2011-06-07 06:33:03 2011-06-07 06:33:03 ... N/A Yes
6 2011-06-07 06:33:05 2011-06-07 06:33:05 ... N/A No
7 2011-06-07 06:33:05 2011-06-07 06:33:05 ... N/A Yes
8 2011-06-07 06:33:07 2011-06-07 06:33:07 ... N/A No
9 2011-06-07 06:33:07 2011-06-07 06:33:07 ... N/A No
10 2011-06-07 06:39:31 2011-06-07 06:39:31 ... N/A No
... ... ... ... ... ...
61 2011-05-08 00:00:00 2011-05-08 00:00:01 ... spring Yes
62 2011-05-08 00:00:02 2011-05-08 00:00:03 ... spring No
63 2011-05-08 00:00:03 2011-05-08 00:00:04 ... spring Yes
64 2012-08-05 00:00:01 2012-08-05 00:00:02 ... N/A No
65 2012-08-05 00:00:01 2012-08-05 00:00:02 ... N/A No
66 2012-08-05 00:00:02 2012-08-05 00:00:03 ... N/A Yes
67 2012-08-05 00:00:02 2012-08-05 00:00:03 ... N/A Yes
68 2013-08-05 00:00:01 2013-08-05 00:00:02 ... N/A No
69 2013-08-05 00:00:01 2013-08-05 00:00:02 ... N/A No
70 2013-08-05 00:00:02 2013-08-05 00:00:03 ... N/A Yes
71 2013-08-05 00:00:02 2013-08-05 00:00:03 ... N/A Yes
Length = 39 rows
7. Querying the database¶
The API for querying databases is similar to querying the VSO using the
method sunpy.net.vso.VSOClient.search()
. The Database.search()
method accepts any number of ORed query attributes (using |) and
combines them using AND. It returns a list of matched database entries.
The special thing about querying databases is that all attributes support
the unary operator ~
to negate specific attributes. Example: the query
Instrument.eit
returns all entries that have not been observed
with the EIT.
7.1 Using VSO attributes¶
Using the attributes from sunpy.net.attrs
is quite intuitive:
the simple attributes and the Time attribute work exactly as you expect.
Note
The near
parameter of sunpy.net.attrs.Time
is ignored, because
it’s behaviour is not documented and it is different depending on the
server which is requested.
The following query returns the data that was added in section 2.3.2:
>>> print(display_entries(
... database.search(a.Time('2012-08-05', '2012-08-05 00:00:05'), a.Instrument.aia),
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax'], sort=True))
id observation_time_start observation_time_end instrument wavemin wavemax
--- ---------------------- -------------------- ---------- ------- -------
64 2012-08-05 00:00:01 2012-08-05 00:00:02 AIA 9.4 9.4
65 2012-08-05 00:00:01 2012-08-05 00:00:02 AIA 9.4 9.4
66 2012-08-05 00:00:02 2012-08-05 00:00:03 AIA 33.5 33.5
67 2012-08-05 00:00:02 2012-08-05 00:00:03 AIA 33.5 33.5
When using the sunpy.net.attrs.Wavelength
attribute, you have to
specify a unit using astropy.units.Quantity
. If not an error is
raised. This also means that there is no default unit that is
used by the class. To know how you can specify a detail using astropy
check astropy.units
.
>>> import astropy.units as u
>>> print(display_entries(
... database.search(a.Wavelength(1.0*u.nm, 2.0*u.nm)),
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax'], sort=True))
id observation_time_start ... wavemin wavemax
--- ---------------------- ... ------------------ ------------------
1 2011-06-07 06:33:02 ... 17.1 17.1
10 2011-06-07 06:39:31 ... 19.3 19.3
11 2011-06-07 06:45:55 ... 19.3 19.3
12 2011-06-07 06:52:19 ... 19.3 19.3
13 2011-06-07 06:58:43 ... 19.3 19.3
17 2014-04-09 06:00:12 ... 17.1 17.1
18 2011-06-07 20:37:52 ... 19.5 19.5
2 2011-06-07 06:33:01 ... 13.100000000000001 13.100000000000001
3 2011-06-07 06:33:02 ... 17.1 17.1
58 2011-06-07 06:33:29 ... 17.400000000000002 17.400000000000002
60 2011-05-08 00:00:00 ... 17.1 17.1
8 2011-06-07 06:33:07 ... 19.3 19.3
9 2011-06-07 06:33:07 ... 19.3 19.3
7.2 Database-specific attributes¶
There are 5 additional query attributes supported by the database package.
They can be imported from the submodule sunpy.database.attrs
and are in
particular:
Starred
Tag
Path
DownloadTime
FitsHeaderEntry
The following query searches for all entries that have the tag ‘spring’ or (inclusive or!) are starred and have not the FITS header key ‘WAVEUNIT’ with the value ‘Angstrom’:
>>> import sunpy.database.attrs as dbattrs
>>> print(display_entries(
... database.search(dbattrs.Tag('spring') | dbattrs.Starred(), ~dbattrs.FitsHeaderEntry('WAVEUNIT', 'Angstrom')),
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax', 'tags', 'starred'], sort=True))
id observation_time_start observation_time_end ... tags starred
--- ---------------------- -------------------- ... ------ -------
16 2011-06-07 06:32:11 2011-06-07 06:32:11 ... N/A Yes
17 2014-04-09 06:00:12 2014-04-09 06:00:12 ... spring No
4 2011-06-07 06:33:02 2011-06-07 06:33:02 ... N/A Yes
5 2011-06-07 06:33:03 2011-06-07 06:33:03 ... N/A Yes
60 2011-05-08 00:00:00 2011-05-08 00:00:01 ... spring No
61 2011-05-08 00:00:00 2011-05-08 00:00:01 ... spring Yes
62 2011-05-08 00:00:02 2011-05-08 00:00:03 ... spring No
63 2011-05-08 00:00:03 2011-05-08 00:00:04 ... spring Yes
66 2012-08-05 00:00:02 2012-08-05 00:00:03 ... N/A Yes
67 2012-08-05 00:00:02 2012-08-05 00:00:03 ... N/A Yes
7 2011-06-07 06:33:05 2011-06-07 06:33:05 ... N/A Yes
70 2013-08-05 00:00:02 2013-08-05 00:00:03 ... N/A Yes
71 2013-08-05 00:00:02 2013-08-05 00:00:03 ... N/A Yes
8. Caching¶
All entries that are saved in the database are also saved in a cache
in-memory. The type of the cache is determined at the initialization of
the database object and cannot be changed after that. The default type is
caching.LRUCache
(least-recently used) and the other one which is
supported is caching.LFUCache
(least-frequently used). By
default the cache size is float('inf')
, i.e. infinite. To set the
cache size after the database object has been initialized, use the method
Database.set_cache_size()
. If the new size is smaller than the
current number of database entries, entries are removed according to the
cache type until the number of entries is equal to the given cache size.
The following call to Database.set_cache_size()
sets the cache size
to 10 and therefore removes the 5 entries that been used least recently.
>>> database.set_cache_size(10)
>>> print(display_entries(
... database,
... ['id', 'observation_time_start', 'observation_time_end',
... 'instrument', 'wavemin', 'wavemax']))
id observation_time_start ... wavemin wavemax
--- ---------------------- ... ------------------ ------------------
9 2011-06-07 06:33:07 ... 19.3 19.3
10 2011-06-07 06:39:31 ... 19.3 19.3
11 2011-06-07 06:45:55 ... 19.3 19.3
12 2011-06-07 06:52:19 ... 19.3 19.3
13 2011-06-07 06:58:43 ... 19.3 19.3
16 2011-06-07 06:32:11 ... 617.3000000000001 617.3000000000001
17 2014-04-09 06:00:12 ... 17.1 17.1
18 2011-06-07 20:37:52 ... 19.5 19.5
58 2011-06-07 06:33:29 ... 17.400000000000002 17.400000000000002
59 2011-06-06 00:00:00 ... N/A N/A