Tutorial#

This tutorial gives an introduction on how to use the drms Python library. More detailed information on the different classes and functions can be found in the API Reference.

Basic usage#

We start with looking at data series that are available from JSOC and perform some basic DRMS queries to obtain keyword data (metadata) and segment file (data) locations. This is essentially what you can do on the JSOC Lookdata website.

To be able to access the JSOC DRMS from Python, we first need to import the drms module and create an instance of the Client class:

>>> import drms
>>> client = drms.Client()  

All available data series can be now retrieved by calling drms.Client.series. HMI series names start with "hmi.", AIA series names with "aia." and the names of MDI series with "mdi.".

The first (optional) parameter of this method takes a regular expression that allows you to filter the result. If for example, you want to obtain a list of HMI series, with a name that start with the string "m_", you can write:

>>> client.series(r'hmi\.m_')  
['hmi.M_45s', 'hmi.M_45s_dcon', 'hmi.M_720s', 'hmi.M_720s_dcon', 'hmi.M_720s_dconS', 'hmi.m_720s_mod', 'hmi.m_720s_nrt']

Keep in mind to escape the dot character ('.'), like it is shown in the example above, if you want to include it in your filter string. Also note that series names are handled in a case-insensitive way.

DRMS records can be selected by creating a query string that contains a series name, followed by one or more fields, which are surrounded by square brackets. Each of those fields corresponds to a specific primekey, that is specified in the series definition. A complete set of primekeys represents a unique identifier for a record in that particular series. For more detailed information on building record set queries, including additional non-primekey fields, see the JSOC Help page about this.

With the drms module you can use drms.Client.pkeys to obtain a list of all primekeys of a series:

>>> client.pkeys('hmi.m_720s')  
['T_REC', 'CAMERA']
>>> client.pkeys('hmi.v_sht_modes')  
['T_START', 'LMIN', 'LMAX', 'NDT']

A list of all (regular) keywords can be obtained using drms.Client.keys. You can also use drms.Client.info to get more detailed information about a series:

>>> series_info = client.info('hmi.v_avg120')  
>>> series_info.segments  
        type  units protocol       dims               note
name
mean   short    m/s     fits  4096x4096       Doppler mean
power  short  m2/s2     fits  4096x4096      Doppler power
valid  short     NA     fits  4096x4096  valid pixel count
Log     char     NA  generic                       run log

All table-like structures, returned by routines in the drms module, are Pandas DataFrames. If you are new to Pandas, you should have a look at the introduction to Pandas Data Structures.

Record set queries, used to obtain keyword data and get the location of data segments, can be performed using drms.Client.query. To get, for example, the record time and the mean value for some of the HMI Dopplergrams that were recorded on April 1, 2016, together with the spacecraft’s radial velocity in respect to the Sun, you can write:

>>> query = client.query('hmi.v_45s[2016.04.01_TAI/1d@6h]',
...             key='T_REC, DATAMEAN, OBS_VR')  
>>> query  
                     T_REC     DATAMEAN       OBS_VR
0  2016.04.01_00:00:00_TAI  3313.104980  3309.268006
1  2016.04.01_06:00:00_TAI   878.075195   887.864139
2  2016.04.01_12:00:00_TAI -2289.062500 -2284.690263
3  2016.04.01_18:00:00_TAI   128.609283   137.836168

JSOC time strings can be converted to a naive datetime representation using drms.utils.to_datetime():

>>> timestamps = drms.to_datetime(query.T_REC)  
>>> timestamps  
0   2016-04-01 00:00:00
1   2016-04-01 06:00:00
2   2016-04-01 12:00:00
3   2016-04-01 18:00:00
Name: T_REC, dtype: datetime64[ns]

For most of the HMI and MDI data sets, the TAI time standard is used which, in contrast to UTC, does not make use of any leap seconds. The TAI standard is currently not supported by the Python standard libraries. If you need to convert timestamps between TAI and UTC, you can use Astropy:

>>> from astropy.time import Time
>>> start_time = Time(timestamps[0], format='datetime', scale='tai')  
>>> start_time  
<Time object: scale='tai' format='datetime' value=2016-04-01 00:00:00>
>>> start_time.utc  
<Time object: scale='utc' format='datetime' value=2016-03-31 23:59:24>

The "hmi.v_45s" series has a data segment with the name "Dopplergram", which contains Dopplergrams for each record in the series, that are stored as FITS files. The location of the FITS files for the record set query in the example above, can be obtained by using the seg parameter of drms.Client.query:

>>> query = client.query('hmi.v_45s[2016.04.01_TAI/1d@6h]', seg='Dopplergram')  
>>> query  
                                 Dopplergram
0  /SUM58/D803708321/S00008/Dopplergram.fits
1  /SUM41/D803708361/S00008/Dopplergram.fits
2  /SUM71/D803720859/S00008/Dopplergram.fits
3  /SUM70/D803730119/S00008/Dopplergram.fits

Note that the key and seg parameters can also be used together in one drms.Client.query call:

>>> keys, segments = client.query('hmi.v_45s[2016.04.01_TAI/1d@6h]',
...                key='T_REC, DATAMEAN, OBS_VR', seg='Dopplergram')  

The file paths listed above are the storage location on the JSOC server. You can access these files, even if you do not have direct NFS access to the filesystem, by prepending the JSOC URL to segment file path:

>>> url = 'http://jsoc.stanford.edu' + segments.Dopplergram[0]  
>>> url  
'http://jsoc.stanford.edu/SUM58/D803708321/S00008/Dopplergram.fits'

>>> from astropy.io import fits
>>> data = fits.getdata(url)  
>>> print(data.shape, data.dtype)  
(4096, 4096) float32

Note that FITS files which are accessed in this way, do not contain any keyword data in their headers. This is perfectly fine in many cases, because you can just use drms.Client.query to obtain the data of all required keywords. If you need FITS files with headers that contain all the keyword data, you need to submit an export request to JSOC, which is described in the next section.

Export requests can also be useful, if you want to download more than only one or two files (even without keyword headers), because you can then use drms.ExportRequest.download, which takes care of creating URLs, downloading the data and (if necessary) generating suitable local filenames.

Data export requests#

Data export requests can be interactively built and submitted on the JSOC Export Data webpage, where you can also find more information about the different export options that are available. Note that a registered email address is required to for submitting export requests. You can register your email address on the JSOC email registration webpage.

It is advisable to have a closer look at the export webpage before submitting export requests using the drms library. It is also possible to submit an export request on the webpage and then use the Python routines to query the request status and download files.

Warning

Please replace the email below with your own registered email.

>>> import os
>>> email_address = os.environ["JSOC_EMAIL"]  

First, we start again with importing the drms library and creating a Client instance:

>>> import drms
>>> client = drms.Client(email=email_address)  

In this case we also provide an email address (which needs to be already registered at JSOC).

We now create a download directory for our downloads, in case it does not exist yet:

>>> import os
>>> out_dir = 'downloads'
>>> if not os.path.exists(out_dir):
...     os.mkdir(out_dir)

Data export requests can be submitted using drms.Client.export. The most important parameters of this method, besides the export query string, are the parameters method and protocol. There are many different export methods and protocols available. In the following examples we confine ourselves to the methods url_quick and url and the protocols as-is and fits.

url_quick / as-is#

The most direct and quickest way of downloading files is the combination url_quick / as-is. This (in most cases) does not create an actual export request, where you would have to wait for it being finished, but rather compiles a list of files from your data export query, which can then be directly downloaded. This also means that this kind of export usually has no ExportID assigned to it. The only time it is treated as a “real” export request (including an ExportID and some wait time) is, when the requested data segments are not entirely online, and parts of the requested files need to be restored from tape drives.

As an example, we now create an url_quick / as-is export request for the same record set that was used in the previous section. For export requests, the segment name is specified using an additional field in the query string, surrounded by curly braces. Note that drms.Client.export performs an url_quick / as-is export request by default, so you do not need to explicitly use method='url_quick' and protocol='as-is' in this case:

>>> export_request = client.export('hmi.v_45s[2016.04.01_TAI/1d@6h]{Dopplergram}')  
>>> export_request  
<ExportRequest: id=None, status=0>

>>> export_request.data.filename  
0    /SUM58/D803708321/S00008/Dopplergram.fits
1    /SUM41/D803708361/S00008/Dopplergram.fits
2    /SUM71/D803720859/S00008/Dopplergram.fits
3    /SUM70/D803730119/S00008/Dopplergram.fits
Name: filename, dtype: object

Download URLs can now be generated using the drms.client.ExportRequest.urls attribute:

>>> export_request.urls.url[0]  
'http://jsoc.stanford.edu/SUM58/D803708321/S00008/Dopplergram.fits'

Files can be downloaded using drms.ExportRequest.download. You can (optionally) select which file(s) you want to download, by using the index parameter of this method. The following, for example, only downloads the first file of the request:

>>> export_request.download(out_dir, index=0)  
                                              record                                                url                                           download
0  hmi.V_45s[2016.04.01_00:00:00_TAI][2]{Dopplerg...  http://jsoc.stanford.edu/SUM58/D803708321/S000...  ...

Being a direct as-is export, there are no keyword data written to any FITS headers. If you need keyword data added to the headers, you have to use the fits export protocol instead, which is described below.

url / fits#

Using the fits export protocol, allows you to request FITS files that include all keyword data in their headers. Note that this protocol does not convert other file formats into the FITS format. The only purpose of protocol='fits' is to add keyword data to headers of segment files, that are already stored using the FITS format.

In contrast to url_quick / as-is exports, described in the previous subsection, url / fits exports always create a “real” data export request on the server, which needs to be processed before you can download the requested files. For each request you will get an unique ExportID, which can be accessed using the drms.client.ExportRequest.id attribute. In addition you will get an email notification (including the ExportID), which is sent to your registered email address when the requested files are ready for download.

In the following example, we use the hmi.sharp_720s series, which contains Spaceweather HMI Active Region Patches (SHARPs), and download some data files from this series.

First we have a look at the content of the series, by using drms.Client.info to get a SeriesInfo instance for this particular series:

>>> series_info = client.info('hmi.sharp_720s')  
>>> series_info.note  
'Spaceweather HMI Active Region Patch (SHARP): CCD coordinates'
>>> series_info.primekeys  
['HARPNUM', 'T_REC']

This series contains a total of 31 different data segments:

>>> len(series_info.segments)  
31
>>> series_info.segments.index.values  
array(['magnetogram', 'bitmap', 'Dopplergram', 'continuum', 'inclination',
       'azimuth', 'field', 'vlos_mag', 'dop_width', 'eta_0', 'damping',
       'src_continuum', 'src_grad', 'alpha_mag', 'chisq', 'conv_flag',
       'info_map', 'confid_map', 'inclination_err', 'azimuth_err',
       'field_err', 'vlos_err', 'alpha_err', 'field_inclination_err',
       'field_az_err', 'inclin_azimuth_err', 'field_alpha_err',
       'inclination_alpha_err', 'azimuth_alpha_err', 'disambig',
       'conf_disambig'], dtype=object)

Here, we are only interested in magnetograms and continuum intensity maps:

>>> series_info.segments.loc[['continuum', 'magnetogram']]  
            type  units protocol     dims                 note
name
continuum    int   DN/s     fits  VARxVAR  continuum intensity
magnetogram  int  Gauss     fits  VARxVAR          magnetogram

which are stored as FITS files with varying dimensions.

If we now want to submit an export request for a magnetogram and an intensity map of HARP number 10490, recorded at eight am on December 7th, 2023, we can use the following export query string:

>>> query_string = 'hmi.sharp_720s[10490][2023.12.07_08:00:00_TAI]{continuum}'  

In order to obtain FITS files that include keyword data in their headers, we then need to use protocol='fits' when submitting the request using drms.Client.export:

>>> export_request = client.export(query_string, method='url', protocol='fits')  
>>> export_request  
<ExportRequest: id=JSOC_..., status=2>

We now need to wait for the server to prepare the requested files:

>>> export_request.wait()  
True

>>> export_request.status  
0

Note that calling drms.ExportRequest.wait is optional. It gives you some control over the waiting process, but it can be usually omitted, in which case wait() is called implicitly, when you for example try to download the requested files.

After the export request is finished, a unique request URL is created for you, which points to the location where all your requested files are stored. You can use the drms.client.ExportRequest.request_url attribute to obtain this URL:

>>> export_request.request_url  
'http://jsoc.stanford.edu/.../S00000'

Note that this location is only temporary and that all files will be deleted after a couple of days.

Downloading the data works exactly like in the previous example, by using drms.ExportRequest.download:

>>> export_request.download(out_dir)  
                                              record                                                url                          download
0  warning=No FITS files were exported. The reque...  http://jsoc.stanford.edu/...  /...

If you want to access an existing export request that you have submitted earlier, or if you submitted an export request using the JSOC Export Data webpage. You can use drms.Client.export_from_id with the corresponding ExportID to create an drms.client.ExportRequest instance for this particular request.