Records¶
In addition to the Python modules documented below, note that the directory hepdata/modules/records/static/js
contains the JavaScript code that renders the tables and plots in a web browser using the
D3.js library.
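hepdata.modules.records.api | API for HEPData-Records.
hepdata.modules.records.ext | Jinja utilities for Invenio.
hepdata.modules.records.views | Blueprint for HEPData-Records.
hepdata.modules.records.subscribers.api | HEPData Subscribers API.
hepdata.modules.records.subscribers.models | HEPData Subscribers Model.
hepdata.modules.records.utils.records_update_utils | Update INSPIRE publication information.
hepdata.modules.records.utils.yaml_utils | YAML Processing Utils.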
hepdata.modules.records.api¶
API for HEPData-Records.
- hepdata.modules.records.api.format_submission(recid, record, version, version_count, hepdata_submission, data_table=None)[source]¶
Performs all the processing of the record to be displayed.
- Parameters
recid –
record –
version –
version_count –
hepdata_submission –
data_table –
- Returns
- hepdata.modules.records.api.format_tables(ctx, data_record_query, data_table, recid)[source]¶
Finds all the tables related to a submission and formats them for display in the UI or as JSON.
- Returns
- hepdata.modules.records.api.get_commit_message(ctx, recid)[source]¶
Returns a commit message for the current version if present.
- Parameters
ctx –
recid –
- hepdata.modules.records.api.create_breadcrumb_text(authors, ctx, record)[source]¶
Creates the breadcrumb text for a submission.
- hepdata.modules.records.api.submission_has_resources(hepsubmission)[source]¶
Returns whether the submission has resources attached.
- Parameters
hepsubmission – HEPSubmission object
- Returns
bool
- hepdata.modules.records.api.render_record(recid, record, version, output_format, light_mode=False)[source]¶
- hepdata.modules.records.api.create_new_version(recid, user, notify_uploader=True, uploader_message=None)[source]¶
- hepdata.modules.records.api.process_payload(recid, file, redirect_url, synchronous=False)[source]¶
Process an uploaded file
- Parameters
recid – int The id of the record to update
file – file The file to process
redirect_url – string Redirect URL to record, for use if the upload fails or in synchronous mode
synchronous – bool If False (default), process asynchronously via Celery; if True, process immediately (only recommended for tests)
- Returns
JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error).
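As an illustration, a client could branch on the two response shapes described above. This is a minimal sketch and not part of HEPData: describe_upload_result is a hypothetical helper, and it assumes the JSON body has already been parsed into a dict and that the HTTP status code is known.

```python
def describe_upload_result(status_code, payload):
    """Summarise a parsed JSON response of the shape described above.

    Hypothetical helper for illustration; not part of HEPData.
    """
    # Error cases carry a 'message' key and are served with a 400 status.
    if status_code == 400 or "message" in payload:
        return "failed: " + payload.get("message", "unknown error")
    # Success cases carry a 'url' key.
    if "url" in payload:
        return "ok: " + payload["url"]
    return "unexpected response"


print(describe_upload_result(200, {"url": "/record/1"}))        # ok: /record/1
print(describe_upload_result(400, {"message": "Invalid file"}))  # failed: Invalid file
```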
- (task)hepdata.modules.records.api.process_saved_file(file_path, recid, userid, redirect_url, previous_status)[source]¶
- hepdata.modules.records.api.process_zip_archive(file_path, id, old_submission_schema=False, old_data_schema=False)[source]¶
- hepdata.modules.records.api.check_and_convert_from_oldhepdata(input_directory, id, timestamp)[source]¶
Checks whether the input directory contains a .oldhepdata file and, if so, converts it to YAML.
- hepdata.modules.records.api.assign_or_create_review_status(data_table_metadata, publication_recid, version)[source]¶
If a review already exists, it will be attached to the current data record. If a review does not exist for a data table, it will be created.
- Parameters
data_table_metadata – the metadata describing the main table.
publication_recid – publication record id
version –
- hepdata.modules.records.api.process_data_tables(ctx, data_record_query, first_data_id, data_table=None)[source]¶
- hepdata.modules.records.api.get_all_ids(index=None, id_field='recid', last_updated=None, latest_first=False)[source]¶
Get all record or inspire ids of publications in the search index
- Parameters
index – name of index to use.
id_field – id type to return. Should be ‘recid’ or ‘inspire_id’
- Returns
list of integer ids
hepdata.modules.records.ext¶
Jinja utilities for Invenio.
hepdata.modules.records.views¶
Blueprint for HEPData-Records.
- hepdata.modules.records.views.metadata(recid)[source]¶
Queries and returns a data record.
- Parameters
recid – the record id being queried
- Returns
renders the record template
- hepdata.modules.records.views.get_latest()[source]¶
Returns the N latest records from the database.
- Parameters
n – number of latest records to return (not an argument in the signature above)
- Returns
- hepdata.modules.records.views.get_table_details(recid, data_recid, version)[source]¶
Get the table details.
- Parameters
recid –
data_recid –
version –
- Returns
- hepdata.modules.records.views.get_coordinator_view(recid)[source]¶
Returns the coordinator view for a record.
- Parameters
recid –
- hepdata.modules.records.views.get_data_reviews_for_record()[source]¶
Get the data reviews for a record.
- Returns
json response with reviews (or a json with an error key if not)
- hepdata.modules.records.views.add_data_review_messsage(publication_recid, data_recid)[source]¶
Adds a new review message for a data submission.
- Parameters
publication_recid –
data_recid –
- hepdata.modules.records.views.get_all_review_messages(publication_recid)[source]¶
Gets the review messages for a publication id.
- Parameters
publication_recid –
- Returns
- hepdata.modules.records.views.get_resources(recid, version)[source]¶
Gets a list of resources for a publication, relevant to all data records.
- Parameters
recid –
- Returns
json
- hepdata.modules.records.views.process_resource(reference)[source]¶
For a submission resource, create the link to the location, or the image file if an image.
- Parameters
reference –
- Returns
dict
- hepdata.modules.records.views.get_resource(resource_id)[source]¶
Attempts to find any HTML resources to be displayed for a record in the event that it does not have proper data records included.
- Parameters
recid – publication record id
- Returns
json dictionary containing any HTML files to show.
- hepdata.modules.records.views.cli_upload()[source]¶
Used by the hepdata-cli tool to upload a submission.
- Returns
- hepdata.modules.records.views.revise_submission(recid)[source]¶
This method creates a new version of a submission.
- Parameters
recid – record id to attach the data to
- Returns
For POST requests, returns JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error). For GET requests, redirects to the record.
- hepdata.modules.records.views.consume_data_payload(recid)[source]¶
This method persists, then presents the loaded data back to the user.
- Parameters
recid – record id to attach the data to
- Returns
For POST requests, returns JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error). For GET requests, redirects to the record.
- hepdata.modules.records.views.attach_information_to_record(recid)[source]¶
Given an INSPIRE data representation, this will process the data, and update information for a given record id with the contents.
- Returns
- hepdata.modules.records.views.consume_sandbox_payload()[source]¶
Creates a new sandbox submission with a new file upload.
- Parameters
recid –
hepdata.modules.records.importer.api¶
- hepdata.modules.records.importer.api.import_records(inspire_ids, synchronous=False, update_existing=False, base_url='https://hepdata.net', send_email=False)[source]¶
Import records from hepdata.net
- Parameters
inspire_ids – array of inspire ids to load (in the format insXXX).
synchronous – whether to run immediately rather than via Celery
update_existing – whether to update records that already exist
base_url – override default base URL
send_email – whether to send emails on finalising submissions
- Returns
None
- hepdata.modules.records.importer.api.get_inspire_ids(base_url='https://hepdata.net', last_updated=None, n_latest=None)[source]¶
Get inspire IDs from hepdata.net
- Parameters
last_updated – get IDs of records updated on/after this date
n_latest – get the n most recently updated IDs
base_url – override default base URL
- Returns
list of integer IDs, or False in the case of errors
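The two functions above describe their IDs differently: get_inspire_ids returns integers (or False on error), while import_records expects strings in the format insXXX. A small bridging step might look like the sketch below. to_inspire_keys is a hypothetical helper, not part of HEPData, and treating False as "nothing to import" is an assumption.

```python
def to_inspire_keys(ids):
    """Format integer INSPIRE IDs as 'insXXX' strings (hypothetical helper)."""
    if ids is False:
        # The documentation above says False signals an error.
        return []
    return ["ins{}".format(i) for i in ids]


print(to_inspire_keys([1234567, 7654321]))  # ['ins1234567', 'ins7654321']
print(to_inspire_keys(False))               # []
```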
hepdata.modules.records.subscribers.api¶
HEPData Subscribers API.
hepdata.modules.records.subscribers.models¶
HEPData Subscribers Model.
hepdata.modules.records.subscribers.rest¶
hepdata.modules.records.utils.common¶
- hepdata.modules.records.utils.common.find_file_in_directory(directory, file_predicate)[source]¶
Finds a file in a directory. Useful when, for example, the submission.yaml file is not at the top level of the unzipped archive but one or more levels below.
- Parameters
directory –
file_predicate – a lambda that checks if it’s the file you’re looking for
- Returns
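The predicate-based search described above can be sketched with os.walk. This is an illustrative stand-in and not the HEPData implementation: find_first_file is a hypothetical name, and returning a single path (or None) is an assumption about the return value, which is not documented here.

```python
import os
import tempfile


def find_first_file(directory, file_predicate):
    """Walk `directory` top-down and return the path of the first file whose
    name satisfies `file_predicate`, or None if nothing matches."""
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if file_predicate(name):
                return os.path.join(root, name)
    return None


# Demo: submission.yaml sits two levels below the top of the unzipped archive.
with tempfile.TemporaryDirectory() as top:
    nested = os.path.join(top, "archive", "inner")
    os.makedirs(nested)
    with open(os.path.join(nested, "submission.yaml"), "w") as handle:
        handle.write("comment: example\n")

    found = find_first_file(top, lambda name: name == "submission.yaml")
    missing = find_first_file(top, lambda name: name == "absent.yaml")
    relative = os.path.relpath(found, top)

print(relative)
print(missing)
```

Passing a predicate rather than a fixed file name lets the same search match by extension or pattern, for example lambda name: name.endswith(".yaml").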
- hepdata.modules.records.utils.common.get_record_contents(recid, status=None)[source]¶
Tries to get record from Elasticsearch first. Failing that, it tries from the database.
- Parameters
recid – Record ID to get.
status – Status of submission. If provided and not ‘finished’, will not check elasticsearch first.
- Returns
a dictionary containing the record contents if the recid exists, None otherwise.
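The lookup order described above (search index first, database as a fallback, skipping the index for unfinished submissions) can be sketched as follows. Plain dicts stand in for Elasticsearch and the database, and get_with_fallback is a hypothetical name; the real function's internals are not shown in this documentation.

```python
def get_with_fallback(recid, search_index, database, status=None):
    """Return record contents from the index if allowed and present,
    otherwise from the database; None if the recid is unknown."""
    # Skip the index when a status is given and it is not 'finished'.
    use_index = status is None or status == "finished"
    if use_index and recid in search_index:
        return search_index[recid]
    return database.get(recid)


index = {1: {"title": "indexed"}}
db = {1: {"title": "stored"}, 2: {"title": "draft"}}

print(get_with_fallback(1, index, db))                 # {'title': 'indexed'}
print(get_with_fallback(1, index, db, status="todo"))  # {'title': 'stored'}
print(get_with_fallback(2, index, db))                 # {'title': 'draft'}
print(get_with_fallback(3, index, db))                 # None
```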
hepdata.modules.records.utils.data_processing_utils¶
- hepdata.modules.records.utils.data_processing_utils.pad_independent_variables(table_contents)[source]¶
Pads out the independent variable column in the event that nothing exists.
- Parameters
table_contents –
- Returns
- hepdata.modules.records.utils.data_processing_utils.fix_nan_inf(value)[source]¶
Converts NaN, +inf, and -inf values to strings.
- Parameters
value –
- Returns
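A conversion of this kind might look like the sketch below. nan_inf_to_str is a hypothetical stand-in, and the exact strings produced by the real fix_nan_inf are not documented here; this version simply uses Python's str(). One plausible reason for such a step is that strict JSON has no representation for NaN or infinity.

```python
import json
import math


def nan_inf_to_str(value):
    """Return str(value) for float NaN or +/-inf; return anything else unchanged."""
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return str(value)
    return value


row = [1.5, float("nan"), float("inf"), float("-inf")]
cleaned = [nan_inf_to_str(v) for v in row]
print(cleaned)  # [1.5, 'nan', 'inf', '-inf']

# Strict JSON encoding succeeds on the cleaned row; with the raw floats,
# json.dumps(..., allow_nan=False) would raise ValueError.
print(json.dumps(cleaned, allow_nan=False))
```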
- hepdata.modules.records.utils.data_processing_utils.process_independent_variables(table_contents, x_axes, independent_variable_headers)[source]¶
- hepdata.modules.records.utils.data_processing_utils.process_dependent_variables(group_count, record, table_contents, tmp_values, independent_variables, dependent_variable_headers)[source]¶
hepdata.modules.records.utils.doi_minter¶
- (task)hepdata.modules.records.utils.doi_minter.generate_doi_for_table(doi)[source]¶
Generate DOI for a specific table given by its doi.
- Parameters
doi –
- Returns
- (task)hepdata.modules.records.utils.doi_minter.generate_dois_for_submission(*args, **kwargs)[source]¶
Generate DOIs for all the submission components.
- Parameters
args –
kwargs –
- Returns
- hepdata.modules.records.utils.doi_minter.create_container_doi(hep_submission, data_submissions, publication_info, site_url)[source]¶
Creates the payload to wrap the whole submission.
- Parameters
hep_submission –
data_submissions –
publication_info –
- Returns
- hepdata.modules.records.utils.doi_minter.create_data_doi(hep_submission, data_submission, publication_info, site_url)[source]¶
Generate DOI record for a data record.
- Parameters
data_submission_id –
version –
- Returns
- hepdata.modules.records.utils.doi_minter.reserve_doi_for_hepsubmission(hepsubmission, update=False)[source]¶
- hepdata.modules.records.utils.doi_minter.reserve_dois_for_data_submissions(*args, **kwargs)[source]¶
Reserves a DOI for a data submission and saves to the datasubmission object.
- Parameters
data_submission – DataSubmission object representing a data table.
- Returns
hepdata.modules.records.utils.old_hepdata¶
hepdata.modules.records.utils.records_update_utils¶
Update INSPIRE publication information.
- (task)hepdata.modules.records.utils.records_update_utils.update_record_info(inspire_id, send_email=False)[source]¶
Update publication information from INSPIRE for a specific record.
- (task)hepdata.modules.records.utils.records_update_utils.update_records_info_since(date)[source]¶
Update publication information from INSPIRE for all records updated since a certain date.
- (task)hepdata.modules.records.utils.records_update_utils.update_records_info_on(date)[source]¶
Update publication information from INSPIRE for all records updated on a certain date.
- (task)hepdata.modules.records.utils.records_update_utils.update_all_records_info()[source]¶
Update publication information from INSPIRE for all records.
hepdata.modules.records.utils.submission¶
- hepdata.modules.records.utils.submission.remove_submission(record_id, version=1)[source]¶
Removes the database entries and data files related to a record.
- Parameters
record_id –
version –
- Returns
True if successful, False if the record does not exist.
- hepdata.modules.records.utils.submission.cleanup_submission(recid, version, to_keep)[source]¶
Removes old datasubmission records from the database. This ensures that when users replace a submission, previous records are not left behind in the database.
- Parameters
recid – publication recid of parent
version – version number of record
to_keep – an array of names to keep in the submission
- Returns
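The role of to_keep can be illustrated without a database: anything already stored whose name is not in to_keep is stale and due for removal. The sketch below works on plain lists of names; names_to_remove is a hypothetical helper, and the real function operates on datasubmission records rather than strings.

```python
def names_to_remove(existing_names, to_keep):
    """Return, in original order, the names that are not in `to_keep`."""
    keep = set(to_keep)
    return [name for name in existing_names if name not in keep]


stale = names_to_remove(["Table 1", "Table 2", "Table 3"], ["Table 1", "Table 3"])
print(stale)  # ['Table 2']
```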
- hepdata.modules.records.utils.submission.cleanup_data_resources(data_submission)[source]¶
Removes additional resources for a datasubmission from the database to avoid duplications. This ensures that when users replace a submission, old resources are not left behind in the database.
- Parameters
data_submission – DataSubmission object to be cleaned
- Returns
- hepdata.modules.records.utils.submission.cleanup_data_keywords(data_submission)[source]¶
Removes keywords from the database to avoid duplications. This ensures that when users replace a submission, old keywords are not left behind in the database.
- Parameters
data_submission – DataSubmission object to be cleaned
- Returns
- hepdata.modules.records.utils.submission.process_data_file(recid, version, basepath, data_obj, datasubmission, main_file_path)[source]¶
Takes a data file and any supplementary files and persists their metadata to the database whilst recording their upload path.
- Parameters
recid – the record id
version – version of the resource to be stored
basepath – the path the submission has been loaded to
data_obj – Object representation of loaded YAML file
datasubmission – the DataSubmission object representing this file in the DB
main_file_path – the data file path
- Returns
- hepdata.modules.records.utils.submission.process_general_submission_info(basepath, submission_info_document, recid)[source]¶
Processes the top level information about a submission, extracting the information about the data abstract, additional resources for the submission (files, links, and html inserts) and historical modification information.
- Parameters
basepath – the path the submission has been loaded to
submission_info_document – the data document
recid –
- Returns
- hepdata.modules.records.utils.submission.parse_additional_resources(basepath, recid, yaml_document)[source]¶
Parses out the additional resource section for a full submission.
- Parameters
basepath – the path the submission has been loaded to
recid –
yaml_document –
- Returns
- hepdata.modules.records.utils.submission.parse_modifications(hepsubmission, recid, submission_info_document)[source]¶
- hepdata.modules.records.utils.submission.process_submission_directory(basepath, submission_file_path, recid, update=False, old_data_schema=False, old_submission_schema=False)[source]¶
Goes through an entire submission directory and processes the files within to create DataSubmissions with the files and related material attached as DataResources.
- Parameters
basepath –
submission_file_path –
recid –
update –
old_data_schema – whether to use old (v0) data schema
old_submission_schema – whether to use old (v0) submission schema (should only be used when importing old records)
- Returns
- hepdata.modules.records.utils.submission.package_submission(basepath, recid, hep_submission_obj)[source]¶
Zips up a submission directory in advance of its download, for example by users.
- Parameters
basepath – path of directory containing all submission files
recid – the publication record ID
hep_submission_obj – the HEPSubmission object representing the overall position
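Zipping a submission directory can be sketched with the standard library alone. This is an illustration of the general technique and not the HEPData implementation: the file names and the archive name submission_v1 are made up for the demo, and how the real function names or stores its archive is not documented here.

```python
import os
import shutil
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as work:
    # Build a small stand-in submission directory.
    basepath = os.path.join(work, "submission")
    os.makedirs(basepath)
    for name in ("submission.yaml", "table1.yaml"):
        with open(os.path.join(basepath, name), "w") as handle:
            handle.write("# placeholder\n")

    # make_archive appends ".zip" to the base name and returns the archive path.
    archive = shutil.make_archive(
        os.path.join(work, "submission_v1"), "zip", root_dir=basepath
    )
    archive_name = os.path.basename(archive)

    # Read the archive back to confirm its contents.
    with zipfile.ZipFile(archive) as zipped:
        names = sorted(zipped.namelist())

print(archive_name)
print(names)
```

Writing the archive outside basepath avoids the zip file being picked up as one of its own members.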
- hepdata.modules.records.utils.submission.get_or_create_hepsubmission(recid, coordinator=1, status='todo')[source]¶
Gets or creates a new HEPSubmission record.
- Parameters
recid – the publication record id
coordinator – the user id of the user who owns this record
status – e.g. todo, finished.
- Returns
the existing or newly created HEPSubmission object
- hepdata.modules.records.utils.submission.create_data_review(data_recid, publication_recid, version=1)[source]¶
Creates a new data review given a data record id and a publication record id.
- Parameters
data_recid –
publication_recid –
version –
- Returns
- hepdata.modules.records.utils.submission.do_finalise(recid, publication_record=None, force_finalise=False, commit_message=None, send_tweet=False, update=False, convert=True, send_email=True)[source]¶
Creates record SIP for each data record with a link to the associated publication.
- Parameters
synchronous – if True, workflow execution and creation are waited on, and everything is then indexed in one go. If False, object creation is asynchronous, but reindexing is not performed. This is only really useful for the full migration of content.
hepdata.modules.records.utils.users¶
hepdata.modules.records.utils.workflow¶
- hepdata.modules.records.utils.workflow.create_data_structure(ctx)[source]¶
The data structures need to be normalised before being stored in the database. This is performed here.
- Parameters
ctx – record information as a dictionary
- Returns
a cleaned up representation.
- hepdata.modules.records.utils.workflow.update_record(recid, ctx)[source]¶
Updates a record given a new dictionary.
- Parameters
recid –
ctx –
- Returns
hepdata.modules.records.utils.yaml_utils¶
YAML Processing Utils.
- hepdata.modules.records.utils.yaml_utils.write_submission_yaml_block(document, submission_yaml, type='info')[source]¶
- hepdata.modules.records.utils.yaml_utils.split_files(file_location, output_location)[source]¶
- Parameters
file_location – input yaml file location
output_location – output directory path
- hepdata.modules.records.utils.yaml_utils.cleanup_data_yaml(yaml)[source]¶
Casts strings to numbers where possible.
- Parameters
yaml –
- Returns
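Casting strings to numbers where possible can be sketched as a recursive walk over the parsed YAML structure. cast_numbers is a hypothetical stand-in, not the HEPData implementation, and trying int before float is an assumption about the preferred numeric type.

```python
def cast_numbers(node):
    """Recursively convert numeric-looking strings to int or float.

    Note: float() also accepts strings such as 'nan' and 'inf', so those
    would be converted too in this simple version.
    """
    if isinstance(node, dict):
        return {key: cast_numbers(value) for key, value in node.items()}
    if isinstance(node, list):
        return [cast_numbers(item) for item in node]
    if isinstance(node, str):
        for convert in (int, float):
            try:
                return convert(node)
            except ValueError:
                continue
    # Non-strings and non-numeric strings are returned unchanged.
    return node


document = {"values": [{"value": "1.5"}, {"value": "3"}, {"value": "n/a"}]}
print(cast_numbers(document))
# {'values': [{'value': 1.5}, {'value': 3}, {'value': 'n/a'}]}
```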