Records

In addition to the Python modules documented below, note that the directory hepdata/modules/records/static/js contains the JavaScript code that renders the tables and plots in a web browser using the D3.js library.

hepdata.modules.records.api

API for HEPData-Records.

hepdata.modules.records.ext

Jinja utilities for Invenio.

hepdata.modules.records.views

Blueprint for HEPData-Records.

hepdata.modules.records.importer.api

hepdata.modules.records.subscribers.api

HEPData Subscribers API.

hepdata.modules.records.subscribers.models

HEPData Subscribers Model.

hepdata.modules.records.subscribers.rest

hepdata.modules.records.utils.common

hepdata.modules.records.utils.data_processing_utils

hepdata.modules.records.utils.doi_minter

hepdata.modules.records.utils.old_hepdata

hepdata.modules.records.utils.records_update_utils

Update INSPIRE publication information.

hepdata.modules.records.utils.submission

hepdata.modules.records.utils.users

hepdata.modules.records.utils.workflow

hepdata.modules.records.utils.yaml_utils

YAML Processing Utils.

hepdata.modules.records.api

API for HEPData-Records.

hepdata.modules.records.api.returns_json(f)[source]
hepdata.modules.records.api.format_submission(recid, record, version, version_count, hepdata_submission, data_table=None)[source]

Performs all the processing of the record to be displayed.

Parameters
  • recid

  • record

  • version

  • version_count

  • hepdata_submission

  • data_table

Returns

hepdata.modules.records.api.format_tables(ctx, data_record_query, data_table, recid)[source]

Finds all the tables related to a submission and formats them for display in the UI or as JSON.

Returns

hepdata.modules.records.api.get_commit_message(ctx, recid)[source]

Returns a commit message for the current version if present.

Parameters
  • ctx

  • recid

hepdata.modules.records.api.create_breadcrumb_text(authors, ctx, record)[source]

Creates the breadcrumb text for a submission.

hepdata.modules.records.api.submission_has_resources(hepsubmission)[source]

Returns whether the submission has resources attached.

Parameters

hepsubmission – HEPSubmission object

Returns

bool

hepdata.modules.records.api.extract_journal_info(record)[source]
hepdata.modules.records.api.render_record(recid, record, version, output_format, light_mode=False)[source]
hepdata.modules.records.api.has_upload_permissions(recid, user, is_sandbox=False)[source]
hepdata.modules.records.api.has_coordinator_permissions(recid, user, is_sandbox=False)[source]
hepdata.modules.records.api.create_new_version(recid, user, notify_uploader=True, uploader_message=None)[source]
hepdata.modules.records.api.process_payload(recid, file, redirect_url, synchronous=False)[source]

Process an uploaded file

Parameters
  • recid – int The id of the record to update

  • file – file The file to process

  • redirect_url – string Redirect URL to record, for use if the upload fails or in synchronous mode

  • synchronous – bool Whether to process the file immediately (only recommended for tests) rather than asynchronously via Celery (the default)

Returns

JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error).

(task)hepdata.modules.records.api.process_saved_file(file_path, recid, userid, redirect_url, previous_status)[source]
hepdata.modules.records.api.save_zip_file(file, id)[source]
hepdata.modules.records.api.process_zip_archive(file_path, id, old_submission_schema=False, old_data_schema=False)[source]
hepdata.modules.records.api.check_and_convert_from_oldhepdata(input_directory, id, timestamp)[source]

Checks whether the input directory contains a .oldhepdata file and, if so, converts it to YAML.

hepdata.modules.records.api.move_files(submission_temp_path, submission_path)[source]
hepdata.modules.records.api.query_messages_for_data_review(data_review_record, messages)[source]
hepdata.modules.records.api.assign_or_create_review_status(data_table_metadata, publication_recid, version)[source]

If a review already exists, it will be attached to the current data record. If a review does not exist for a data table, it will be created.

Parameters
  • data_table_metadata – the metadata describing the main table.

  • publication_recid – publication record id

  • version

hepdata.modules.records.api.determine_user_privileges(recid, ctx)[source]
hepdata.modules.records.api.process_data_tables(ctx, data_record_query, first_data_id, data_table=None)[source]
hepdata.modules.records.api.truncate_author_list(record, length=10)[source]
hepdata.modules.records.api.get_all_ids(index=None, id_field='recid', last_updated=None, latest_first=False)[source]

Get all record or inspire ids of publications in the search index

Parameters
  • index – name of index to use.

  • id_field – id type to return. Should be ‘recid’ or ‘inspire_id’

Returns

list of integer ids

hepdata.modules.records.ext

Jinja utilities for Invenio.

class hepdata.modules.records.ext.HEPDataRecords(app=None)[source]

HEPData records extension.

init_app(app)[source]

Flask application initialization.

init_config(app)[source]

Initialize configuration.

setup_app(app)[source]

hepdata.modules.records.views

Blueprint for HEPData-Records.

hepdata.modules.records.views.sandbox_display(id)[source]
hepdata.modules.records.views.get_metadata_by_alternative_id(recid)[source]
hepdata.modules.records.views.submit_question(recid)[source]
hepdata.modules.records.views.notify_participants(recid, version)[source]
hepdata.modules.records.views.notify_coordinator(recid, version)[source]
hepdata.modules.records.views.metadata(recid)[source]

Queries and returns a data record.

Parameters

recid – the record id being queried

Returns

renders the record template

hepdata.modules.records.views.get_count_stats()[source]
hepdata.modules.records.views.get_latest()[source]

Returns the N latest records from the database.

Parameters

n

Returns

hepdata.modules.records.views.get_table_details(recid, data_recid, version)[source]

Get the table details.

Parameters
  • recid

  • data_recid

  • version

Returns

hepdata.modules.records.views.get_coordinator_view(recid)[source]

Returns the coordinator view for a record.

Parameters

recid

hepdata.modules.records.views.set_data_review_status()[source]
hepdata.modules.records.views.get_data_reviews_for_record()[source]

Get the data reviews for a record.

Returns

json response with reviews (or a json with an error key if not)

hepdata.modules.records.views.get_data_review_status()[source]
hepdata.modules.records.views.add_data_review_messsage(publication_recid, data_recid)[source]

Adds a new review message for a data submission.

Parameters
  • publication_recid

  • data_recid

hepdata.modules.records.views.get_review_messages_for_data_table(data_recid, version)[source]
hepdata.modules.records.views.get_all_review_messages(publication_recid)[source]

Gets the review messages for a publication id.

Parameters

publication_recid

Returns

hepdata.modules.records.views.get_resources(recid, version)[source]

Gets a list of resources for a publication, relevant to all data records.

Parameters

recid

Returns

json

hepdata.modules.records.views.process_resource(reference)[source]

For a submission resource, create the link to the location, or the image file if an image.

Parameters

reference

Returns

dict

hepdata.modules.records.views.get_resource(resource_id)[source]

Attempts to find any HTML resources to be displayed for a record in the event that it does not have proper data records included.

Parameters

recid – publication record id

Returns

json dictionary containing any HTML files to show.

hepdata.modules.records.views.cli_upload()[source]

Used by the hepdata-cli tool to upload a submission.

Returns

hepdata.modules.records.views.revise_submission(recid)[source]

This method creates a new version of a submission.

Parameters

recid – record id to attach the data to

Returns

For POST requests, returns JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error). For GET requests, redirects to the record.

hepdata.modules.records.views.consume_data_payload(recid)[source]

This method persists, then presents the loaded data back to the user.

Parameters

recid – record id to attach the data to

Returns

For POST requests, returns JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error). For GET requests, redirects to the record.

hepdata.modules.records.views.sandbox()[source]
hepdata.modules.records.views.attach_information_to_record(recid)[source]

Given an INSPIRE data representation, this will process the data, and update information for a given record id with the contents.

Returns

hepdata.modules.records.views.consume_sandbox_payload()[source]

Creates a new sandbox submission with a new file upload.

Parameters

recid

hepdata.modules.records.views.update_sandbox_payload(recid)[source]

Updates the Sandbox submission with a new file upload.

Parameters

recid

hepdata.modules.records.views.add_resource(type, identifier, version)[source]

Adds a data resource to either the submission or individual data files.

Parameters
  • type

  • identifier

  • version

Returns

hepdata.modules.records.importer.api

hepdata.modules.records.importer.api.import_records(inspire_ids, synchronous=False, update_existing=False, base_url='https://hepdata.net', send_email=False)[source]

Import records from hepdata.net

Parameters
  • inspire_ids – array of inspire ids to load (in the format insXXX).

  • synchronous – if should be run immediately rather than via celery

  • update_existing – whether to update records that already exist

  • base_url – override default base URL

  • send_email – whether to send emails on finalising submissions

Returns

None
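The insXXX identifier format mentioned for inspire_ids can be converted to and from plain integer ids along the following lines. This is only a minimal sketch: the helper names to_inspire_id_string and to_inspire_int are hypothetical and are not part of hepdata.modules.records.importer.api, and the second id below is an arbitrary example value.

```python
import re


def to_inspire_int(inspire_id):
    """Extract the integer part of an id written as 'insXXX' (hypothetical helper)."""
    match = re.fullmatch(r"ins(\d+)", inspire_id)
    if match is None:
        raise ValueError(f"Not in insXXX format: {inspire_id!r}")
    return int(match.group(1))


def to_inspire_id_string(number):
    """Format an integer id in the 'insXXX' style (hypothetical helper)."""
    return f"ins{int(number)}"


# Round trip between the two representations.
ids = [to_inspire_id_string(n) for n in (1299143, 12345)]
numbers = [to_inspire_int(i) for i in ids]
```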

hepdata.modules.records.importer.api.get_inspire_ids(base_url='https://hepdata.net', last_updated=None, n_latest=None)[source]

Get inspire IDs from hepdata.net

Parameters
  • last_updated – get IDs of records updated on/after this date

  • n_latest – get the n most recently updated IDs

  • base_url – override default base URL

Returns

list of integer IDs, or False in the case of errors

(task)hepdata.modules.records.importer.api._import_record(inspire_id, update_existing=False, base_url='https://hepdata.net', send_email=False)[source]

hepdata.modules.records.subscribers.api

HEPData Subscribers API.

hepdata.modules.records.subscribers.api.is_current_user_subscribed_to_record(recid)[source]
hepdata.modules.records.subscribers.api.get_users_subscribed_to_record(recid)[source]
hepdata.modules.records.subscribers.api.get_records_subscribed_by_current_user()[source]

hepdata.modules.records.subscribers.models

HEPData Subscribers Model.

class hepdata.modules.records.subscribers.models.Subscribers(**kwargs)[source]

WatchList is the main model for storing the query to be made for a watched query and the user who is watching it.

publication_recid
subscribers

hepdata.modules.records.subscribers.rest

hepdata.modules.records.subscribers.rest.list_subscribers_to_record(recid)[source]
hepdata.modules.records.subscribers.rest.list_subscriptions_for_user()[source]
hepdata.modules.records.subscribers.rest.subscribe(recid)[source]
hepdata.modules.records.subscribers.rest.unsubscribe(recid)[source]

hepdata.modules.records.utils.common

hepdata.modules.records.utils.common.contains_accepted_url(file)[source]
hepdata.modules.records.utils.common.allowed_file(filename)[source]
hepdata.modules.records.utils.common.is_image(filename)[source]
hepdata.modules.records.utils.common.infer_file_type(file)[source]
hepdata.modules.records.utils.common.get_or_create(session, model, **kwargs)[source]
hepdata.modules.records.utils.common.remove_file_extension(filename)[source]
hepdata.modules.records.utils.common.encode_string(string, type='utf-8')[source]
hepdata.modules.records.utils.common.decode_string(string, type='utf-8')[source]
hepdata.modules.records.utils.common.get_license(license_obj)[source]
hepdata.modules.records.utils.common.find_file_in_directory(directory, file_predicate)[source]

Finds a file in a directory. Useful, for example, when the submission.yaml file is not at the top level of the unzipped archive but one or more levels below.

Parameters
  • directory

  • file_predicate – a lambda that checks if it’s the file you’re looking for

Returns
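A predicate-based search of this kind can be sketched with os.walk. The function name find_first_matching_file and its return value (a tuple of containing directory and full path, or None) are assumptions made for illustration; the actual return value of find_file_in_directory is not documented here.

```python
import os
import tempfile


def find_first_matching_file(directory, file_predicate):
    """Walk `directory` top-down and return (dirpath, full_path) for the
    first file name accepted by `file_predicate`, or None if nothing matches."""
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in sorted(filenames):
            if file_predicate(name):
                return dirpath, os.path.join(dirpath, name)
    return None


# Build a throwaway archive layout where submission.yaml sits two levels down.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "archive", "inner")
    os.makedirs(nested)
    with open(os.path.join(nested, "submission.yaml"), "w") as handle:
        handle.write("comment: example\n")

    found = find_first_matching_file(root, lambda name: name == "submission.yaml")
    found_name = os.path.basename(found[1])
    found_in_nested = found[0] == nested
    missing = find_first_matching_file(root, lambda name: name.endswith(".oldhepdata"))
```

Passing the match condition as a callable keeps the directory walk generic: the same helper can look for an exact file name or for a file extension.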

hepdata.modules.records.utils.common.default_time(obj)[source]

Default JSON serializer.
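A default serializer is typically passed to json.dumps through its default argument, which is called for any object the encoder cannot handle natively. The sketch below assumes the serializer deals with date and datetime objects by emitting ISO 8601 strings; the name default_serializer and that behaviour are illustrative assumptions, not a description of what default_time does.

```python
import json
from datetime import date, datetime


def default_serializer(obj):
    """Fallback for json.dumps: encode dates as ISO 8601 strings (assumed behaviour)."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


payload = {"recid": 1, "last_updated": datetime(2021, 1, 15, 12, 30, 0)}
encoded = json.dumps(payload, default=default_serializer)
decoded = json.loads(encoded)
```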

hepdata.modules.records.utils.common.truncate_string(string, words)[source]
hepdata.modules.records.utils.common.get_record_contents(recid, status=None)[source]

Tries to get record from Elasticsearch first. Failing that, it tries from the database.

Parameters
  • recid – Record ID to get.

  • status – Status of submission. If provided and not ‘finished’, Elasticsearch will not be checked first.

Returns

a dictionary containing the record contents if the recid exists, None otherwise.
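The lookup order described above (search index first, database as fallback, with the index skipped for unfinished submissions) can be sketched as follows. The two lookup callables and the in-memory dictionaries are stand-ins invented for this example; they do not reflect how HEPData queries Elasticsearch or its database.

```python
def get_contents_with_fallback(recid, search_lookup, database_lookup, status=None):
    """Try the search index first, unless a non-'finished' status is given,
    then fall back to the database. Returns None if neither has the record."""
    if status is None or status == "finished":
        record = search_lookup(recid)
        if record is not None:
            return record
    return database_lookup(recid)


# In-memory stand-ins for the two back ends (assumptions for this sketch).
search_index = {1: {"recid": 1, "title": "indexed"}}
database = {1: {"recid": 1, "title": "indexed"}, 2: {"recid": 2, "title": "in progress"}}
calls = []


def search_lookup(recid):
    calls.append("search")
    return search_index.get(recid)


def database_lookup(recid):
    calls.append("database")
    return database.get(recid)


indexed = get_contents_with_fallback(1, search_lookup, database_lookup)
unfinished = get_contents_with_fallback(2, search_lookup, database_lookup, status="todo")
absent = get_contents_with_fallback(3, search_lookup, database_lookup)
```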

hepdata.modules.records.utils.common.get_record_by_id(recid)[source]
hepdata.modules.records.utils.common.record_exists(*args, **kwargs)[source]

hepdata.modules.records.utils.data_processing_utils

hepdata.modules.records.utils.data_processing_utils.pad_independent_variables(table_contents)[source]

Pads out the independent variable column in the event that nothing exists.

Parameters

table_contents

Returns

hepdata.modules.records.utils.data_processing_utils.fix_nan_inf(value)[source]

Converts NaN, +inf, and -inf values to strings.

Parameters

value

Returns
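A conversion of this kind could look like the sketch below, which operates on a single scalar. The function name nan_inf_to_string and the exact output strings ('nan', 'inf', '-inf') are assumptions; the shape of the value argument accepted by fix_nan_inf and the strings it produces are not specified here.

```python
import math


def nan_inf_to_string(value):
    """Return a string for non-finite floats; pass every other value through."""
    if isinstance(value, float):
        if math.isnan(value):
            return "nan"
        if math.isinf(value):
            return "inf" if value > 0 else "-inf"
    return value


# Standard JSON has no literal for non-finite floats, which is one reason
# such a conversion can be useful before serializing table values.
row = [1.5, float("nan"), float("inf"), float("-inf"), "0.2"]
cleaned = [nan_inf_to_string(v) for v in row]
```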

hepdata.modules.records.utils.data_processing_utils.process_independent_variables(table_contents, x_axes, independent_variable_headers)[source]
hepdata.modules.records.utils.data_processing_utils.process_dependent_variables(group_count, record, table_contents, tmp_values, independent_variables, dependent_variable_headers)[source]
hepdata.modules.records.utils.data_processing_utils.generate_table_structure(table_contents)[source]

Creates a renderable structure from the table structure we’ve defined.

Parameters

table_contents

Returns

a dictionary encompassing the qualifiers, headers and values

hepdata.modules.records.utils.data_processing_utils.str_presenter(dumper, data)[source]
hepdata.modules.records.utils.data_processing_utils.process_ctx(ctx, light_mode=False)[source]

hepdata.modules.records.utils.doi_minter

(task)hepdata.modules.records.utils.doi_minter.generate_doi_for_table(doi)[source]

Generate DOI for a specific table given by its doi.

Parameters

doi

Returns

(task)hepdata.modules.records.utils.doi_minter.generate_dois_for_submission(*args, **kwargs)[source]

Generate DOIs for all the submission components.

Parameters
  • args

  • kwargs

Returns

hepdata.modules.records.utils.doi_minter.create_container_doi(hep_submission, data_submissions, publication_info, site_url)[source]

Creates the payload to wrap the whole submission.

Parameters
  • hep_submission

  • data_submissions

  • publication_info

Returns

hepdata.modules.records.utils.doi_minter.create_data_doi(hep_submission, data_submission, publication_info, site_url)[source]

Generate DOI record for a data record.

Parameters
  • data_submission_id

  • version

Returns

hepdata.modules.records.utils.doi_minter.reserve_doi_for_hepsubmission(hepsubmission, update=False)[source]
hepdata.modules.records.utils.doi_minter.reserve_dois_for_data_submissions(*args, **kwargs)[source]

Reserves a DOI for a data submission and saves to the datasubmission object.

Parameters

data_submission – DataSubmission object representing a data table.

Returns

hepdata.modules.records.utils.doi_minter.create_doi(doi)[source]

Creates a DOI using the data provider. If it already exists, the existing provider is returned.

Parameters

doi

Returns

DataCiteProvider

hepdata.modules.records.utils.doi_minter.register_doi(doi, url, xml, uuid)[source]

Given a data submission id, this method takes its assigned DOI, creates the DataCite XML, and registers the DOI.

Parameters
  • data_submissions

  • recid

Returns

hepdata.modules.records.utils.old_hepdata

hepdata.modules.records.utils.old_hepdata.mock_import_old_record(inspire_id='1299143', send_email=False)[source]

Creates a submission but mimics the old migrated paths. (See hepdata master branch at ccd691b for old migrator module.)

hepdata.modules.records.utils.records_update_utils

Update INSPIRE publication information.

(task)hepdata.modules.records.utils.records_update_utils.update_record_info(inspire_id, send_email=False)[source]

Update publication information from INSPIRE for a specific record.

(task)hepdata.modules.records.utils.records_update_utils.update_records_info_since(date)[source]

Update publication information from INSPIRE for all records updated since a certain date.

(task)hepdata.modules.records.utils.records_update_utils.update_records_info_on(date)[source]

Update publication information from INSPIRE for all records updated on a certain date.

(task)hepdata.modules.records.utils.records_update_utils.update_all_records_info()[source]

Update publication information from INSPIRE for all records.

hepdata.modules.records.utils.records_update_utils.get_inspire_records_updated_since(date)[source]

Returns all INSPIRE records updated since a given date, specified either as YYYY-MM-DD or as an integer number of days before today (1 = yesterday).

hepdata.modules.records.utils.records_update_utils.get_inspire_records_updated_on(date)[source]

Returns all INSPIRE records updated on a given date, specified either as YYYY-MM-DD or as an integer number of days before today (1 = yesterday).
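The two accepted forms of the date argument (an explicit YYYY-MM-DD string or a day offset where 1 means yesterday) can be resolved to a concrete date as sketched here. The helper resolve_update_date is hypothetical, as is the assumption that the offset arrives as a Python int; the query subsequently sent to INSPIRE is not shown.

```python
from datetime import date, datetime, timedelta


def resolve_update_date(value, today=None):
    """Turn 'YYYY-MM-DD' or an integer day offset into a date object."""
    if today is None:
        today = date.today()
    if isinstance(value, int):
        # Offset form: 1 = yesterday, 2 = the day before, and so on.
        return today - timedelta(days=value)
    return datetime.strptime(value, "%Y-%m-%d").date()


# A fixed reference day keeps the example reproducible.
reference_day = date(2021, 3, 1)
yesterday = resolve_update_date(1, today=reference_day)
explicit = resolve_update_date("2021-01-15")
```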

hepdata.modules.records.utils.submission

hepdata.modules.records.utils.submission.construct_yaml_str(self, node)[source]
hepdata.modules.records.utils.submission.remove_submission(record_id, version=1)[source]

Removes the database entries and data files related to a record.

Parameters
  • record_id

  • version

Returns

True if successful, False if the record does not exist.

hepdata.modules.records.utils.submission.cleanup_submission(recid, version, to_keep)[source]

Removes old datasubmission records from the database. This ensures that when users replace a submission, previous records are not left behind in the database.

Parameters
  • recid – publication recid of parent

  • version – version number of record

  • to_keep – an array of names to keep in the submission

Returns

hepdata.modules.records.utils.submission.cleanup_data_resources(data_submission)[source]

Removes additional resources for a datasubmission from the database to avoid duplications. This ensures that when users replace a submission, old resources are not left behind in the database.

Parameters

data_submission – DataSubmission object to be cleaned

Returns

hepdata.modules.records.utils.submission.cleanup_data_keywords(data_submission)[source]

Removes keywords from the database to avoid duplications. This ensures that when users replace a submission, old keywords are not left behind in the database.

Parameters

data_submission – DataSubmission object to be cleaned

Returns

hepdata.modules.records.utils.submission.process_data_file(recid, version, basepath, data_obj, datasubmission, main_file_path)[source]

Takes a data file and any supplementary files and persists their metadata to the database whilst recording their upload path.

Parameters
  • recid – the record id

  • version – version of the resource to be stored

  • basepath – the path the submission has been loaded to

  • data_obj – Object representation of loaded YAML file

  • datasubmission – the DataSubmission object representing this file in the DB

  • main_file_path – the data file path

Returns

hepdata.modules.records.utils.submission.process_general_submission_info(basepath, submission_info_document, recid)[source]

Processes the top level information about a submission, extracting the information about the data abstract, additional resources for the submission (files, links, and html inserts) and historical modification information.

Parameters
  • basepath – the path the submission has been loaded to

  • submission_info_document – the data document

  • recid

Returns

hepdata.modules.records.utils.submission.parse_additional_resources(basepath, recid, yaml_document)[source]

Parses out the additional resource section for a full submission.

Parameters
  • basepath – the path the submission has been loaded to

  • recid

  • yaml_document

Returns

hepdata.modules.records.utils.submission.parse_modifications(hepsubmission, recid, submission_info_document)[source]
hepdata.modules.records.utils.submission.process_submission_directory(basepath, submission_file_path, recid, update=False, old_data_schema=False, old_submission_schema=False)[source]

Goes through an entire submission directory and processes the files within to create DataSubmissions with the files and related material attached as DataResources.

Parameters
  • basepath

  • submission_file_path

  • recid

  • update

  • old_data_schema – whether to use old (v0) data schema

  • old_submission_schema – whether to use old (v0) submission schema (should only be used when importing old records)

Returns

hepdata.modules.records.utils.submission.package_submission(basepath, recid, hep_submission_obj)[source]

Zips up a submission directory in advance of its download, for example by users.

Parameters
  • basepath – path of directory containing all submission files

  • recid – the publication record ID

  • hep_submission_obj – the HEPSubmission object representing the overall position
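Zipping a directory tree so that archive members keep their paths relative to the submission root can be sketched with the standard zipfile module. The function zip_directory and the file names used below are illustrative assumptions; how package_submission names or stores the archive is not covered here.

```python
import os
import tempfile
import zipfile


def zip_directory(source_dir, zip_path):
    """Write every file under `source_dir` into `zip_path`, storing paths
    relative to `source_dir`. Returns the path of the created archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(source_dir):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path


with tempfile.TemporaryDirectory() as workdir:
    submission_dir = os.path.join(workdir, "submission")
    os.makedirs(os.path.join(submission_dir, "data"))
    for relative in ("submission.yaml", os.path.join("data", "table1.yaml")):
        with open(os.path.join(submission_dir, relative), "w") as handle:
            handle.write("example: true\n")

    # The archive is written outside the directory being zipped.
    zip_path = zip_directory(submission_dir, os.path.join(workdir, "submission.zip"))
    with zipfile.ZipFile(zip_path) as archive:
        archived_names = sorted(archive.namelist())
```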

hepdata.modules.records.utils.submission.process_validation_errors_for_display(errors)[source]
hepdata.modules.records.utils.submission.get_or_create_hepsubmission(recid, coordinator=1, status='todo')[source]

Gets or creates a new HEPSubmission record.

Parameters
  • recid – the publication record id

  • coordinator – the user id of the user who owns this record

  • status – e.g. todo, finished.

Returns

the newly created HEPSubmission object

hepdata.modules.records.utils.submission.create_data_review(data_recid, publication_recid, version=1)[source]

Creates a new data review given a data record id and a publication record id.

Parameters
  • data_recid

  • publication_recid

  • version

Returns

hepdata.modules.records.utils.submission.unload_submission(record_id, version=1)[source]
hepdata.modules.records.utils.submission.do_finalise(recid, publication_record=None, force_finalise=False, commit_message=None, send_tweet=False, update=False, convert=True, send_email=True)[source]

Creates record SIP for each data record with a link to the associated publication.

Parameters

synchronous – if True, workflow execution and creation are waited on, then everything is indexed in one go. If False, object creation is asynchronous; however, reindexing is not performed. This is only really useful for the full migration of content.

hepdata.modules.records.utils.submission.finalise_datasubmission(current_time, existing_submissions, generated_record_ids, publication_record, recid, submission, version)[source]

hepdata.modules.records.utils.users

hepdata.modules.records.utils.users.get_coordinators_in_system()[source]

Utility function to get all coordinator users in the database.

Returns

list of coordinator ids and emails.

hepdata.modules.records.utils.users.has_role(user, required_role)[source]

Determines if a user has a particular role.

Parameters
  • user – a current_user object

  • required_role – e.g. ‘admin’

Returns

True if the user has the role, False otherwise
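A role check of this kind usually reduces to scanning the roles attached to the user. The sketch below assumes a user object exposing a roles collection whose items have a name attribute; the Role and User classes are minimal stand-ins written for this example, not the models used by HEPData.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Role:
    name: str


@dataclass
class User:
    email: str
    roles: List[Role] = field(default_factory=list)


def user_has_role(user, required_role):
    """Return True if any of the user's roles is named `required_role`."""
    return any(role.name == required_role for role in user.roles)


admin = User("admin@example.org", [Role("admin"), Role("coordinator")])
visitor = User("visitor@example.org")
admin_is_admin = user_has_role(admin, "admin")
visitor_is_admin = user_has_role(visitor, "admin")
```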

hepdata.modules.records.utils.workflow

hepdata.modules.records.utils.workflow.create_data_structure(ctx)[source]

The data structures need to be normalised before being stored in the database. This is performed here.

Parameters

ctx – record information as a dictionary

Returns

a cleaned up representation.

hepdata.modules.records.utils.workflow.update_record(recid, ctx)[source]

Updates a record given a new dictionary.

Parameters
  • recid

  • ctx

Returns

hepdata.modules.records.utils.workflow.create_record(ctx)[source]

Creates the record in the database.

Parameters

ctx – The record metadata as a dictionary.

Returns

the recid and the uuid

hepdata.modules.records.utils.workflow.update_action_for_submission_participant(recid, user_id, action)[source]

hepdata.modules.records.utils.yaml_utils

YAML Processing Utils.

hepdata.modules.records.utils.yaml_utils.write_submission_yaml_block(document, submission_yaml, type='info')[source]
hepdata.modules.records.utils.yaml_utils.split_files(file_location, output_location)[source]
Parameters
  • file_location – input yaml file location

  • output_location – output directory path

hepdata.modules.records.utils.yaml_utils.cleanup_data_yaml(yaml)[source]

Casts strings to numbers where possible.

Parameters

yaml

Returns
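Casting strings to numbers where possible can be sketched as a recursive walk over the loaded YAML structure. This example converts every string it finds, which is an assumption: cleanup_data_yaml may restrict the conversion to particular keys. The helper names are hypothetical.

```python
def to_number_if_possible(text):
    """Return an int or float parsed from `text`, or `text` unchanged."""
    if not isinstance(text, str):
        return text
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def cast_strings(node):
    """Recursively apply to_number_if_possible to dicts, lists and scalars."""
    if isinstance(node, dict):
        return {key: cast_strings(value) for key, value in node.items()}
    if isinstance(node, list):
        return [cast_strings(item) for item in node]
    return to_number_if_possible(node)


document = {"values": [{"value": "12"}, {"value": "0.5"},
                       {"value": "1.2e-3"}, {"value": "< 5"}]}
converted = cast_strings(document)
```

Trying int before float keeps whole numbers as integers, and strings that parse as neither are left untouched. Note that float() also accepts spellings such as "nan" and "inf", so a stricter filter may be needed if those must stay as strings.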

hepdata.modules.records.utils.yaml_utils.convert_string_to_numbers(variable_set)[source]
hepdata.modules.records.utils.yaml_utils.cleanup_yaml(yaml, type)[source]
hepdata.modules.records.utils.yaml_utils.add_field_if_needed(yaml, field_name, default_value)[source]
hepdata.modules.records.utils.yaml_utils.remove_keys(yaml, to_remove)[source]
Parameters

yaml

Returns