Adding file hash and number of records to DEIMS EML

Number of Records

DEIMS 7 does not include a field for the number of records in a comma-delimited data file. I added one to the Data Source content type as an integer field named 'Number of records' (machine name: field_number_of_records). (Side note: I actually added the hash output first. After staring at the code for so long, I decided to add this as well.)

1) Modified the EML template eml--node--data-source.tpl.php (in (website root)/profiles/deims/modules/custom/eml/templates) to include the <numberOfRecords> element in the EML output. The new code is the if block that prints <numberOfRecords>; it goes near the end of the file, between </attributeList> and </dataTable>.

  </attributeList>
  <?php if (!empty($content['field_number_of_records'])): ?>
  <numberOfRecords><?php print render($content['field_number_of_records']); ?></numberOfRecords>
  <?php endif; ?>
  </dataTable>
  <?php endif; ?>

2) Modified the EML display (under admin/structure/types/manage/data-source/display/eml) to unhide the new field_number_of_records field.
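The value typed into the 'Number of records' field has to be counted from the data file itself. Below is a minimal sketch of one way to do that in PHP, assuming the file has a single header line and uses standard CSV quoting. The function name count_csv_records and the path in the usage comment are hypothetical and are not part of DEIMS.

```php
<?php
/**
 * Count the data records in a comma-delimited file.
 *
 * Hypothetical helper, not part of DEIMS. Blank lines are skipped and
 * $header_lines rows are assumed to be column headings (assumption: 1).
 *
 * @return int|false Number of records, or FALSE if the file cannot be opened.
 */
function count_csv_records($path, $header_lines = 1) {
  $handle = @fopen($path, 'r');
  if ($handle === FALSE) {
    return FALSE;
  }
  $rows = 0;
  // fgetcsv() keeps quoted fields with embedded newlines as one record.
  while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== FALSE) {
    // A blank line comes back as an array holding a single NULL.
    if ($row === array(NULL)) {
      continue;
    }
    $rows++;
  }
  fclose($handle);
  return max(0, $rows - $header_lines);
}

// Usage (hypothetical path):
// print count_csv_records('/path/to/data/file.csv');
```

Counting with fgetcsv() rather than counting newlines means a quoted field that spans several lines is still counted as one record.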

Hash values

The road to getting hash values into EML was long, but in the end it is fairly simple.

1) The easy part: Download and enable a module called "filehash". It easily added MD5 and SHA-1 hashes to all my files. In phpMyAdmin I could see the added table for hashes, and I could make a view that displays the hash. Note that a file has to be uploaded through Drupal, e.g. via the Data Source content type, to be in the database and have a hash calculated. So if you use scp to replace an uploaded file, the hash will not be recalculated.

    It was not until I looked long and hard at the EML template for the data source that I found how to output the hashes in EML.

2) Next, modify eml--node--data-source.tpl.php (in (website root)/profiles/deims/modules/custom/eml/templates). To find the array names of the hashes, I used the Devel module to browse the data source object values.

    The modification is below; the new lines are the comment and the two <authentication> blocks that sit just before <dataFormat>. This bit of code needs to be added in two places: the section for comma-delimited files, under <dataTable> (near line 55), and the section for <otherEntity> (near line 14).

 <physical>
    <?php if (!empty($entity->field_data_source_file[LANGUAGE_NONE][0])): ?>
    <objectName><?php print check_plain($entity->field_data_source_file[LANGUAGE_NONE][0]['filename']); ?></objectName>
    <size><?php print check_plain($entity->field_data_source_file[LANGUAGE_NONE][0]['filesize']); ?></size>
    <?php endif; ?>
    <?php /* Check whether the uploaded file has a hash. The 'filehash' module generates hashes for files managed by Drupal. */ ?>
    <?php if (!empty($entity->field_data_source_file[LANGUAGE_NONE][0]['filehash']['md5'])): ?>
    <authentication method="MD5"><?php print check_plain($entity->field_data_source_file[LANGUAGE_NONE][0]['filehash']['md5']); ?></authentication>
    <?php endif; ?>
    <?php if (!empty($entity->field_data_source_file[LANGUAGE_NONE][0]['filehash']['sha1'])): ?>
    <authentication method="SHA-1"><?php print check_plain($entity->field_data_source_file[LANGUAGE_NONE][0]['filehash']['sha1']); ?></authentication>
    <?php endif; ?>

    <dataFormat>

I plan on adding this modification to GitHub but have not done so yet. However, here is a patch file for the eml module. Looking at the code, I see that it could use some streamlining and documentation, but I have no time for that right now.

Usage Notes

There are several ways to upload a file. On my site, since I have specific folders where I put my data files, I use scp to put the file on my server. When a file is replaced this way, the hash is not recalculated. Therefore, if I am replacing a file with one that has the same name, I first delete it in Drupal (admin/content/file) and then scp the file. After that, I need to re-upload the file in the data source.
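To tell whether a file replaced with scp still matches the hashes Drupal recorded for it, the hashes can be recomputed with PHP's hash_file() and compared. This is only a sketch, assuming the stored values are plain MD5 and SHA-1 digests of the file contents; the function name file_matches_hashes and the values in the usage comment are hypothetical.

```php
<?php
/**
 * Report whether a file on disk still matches its recorded hashes.
 *
 * Hypothetical helper, not part of DEIMS or the filehash module.
 *
 * @param string $path     Path to the file on the server.
 * @param array  $expected Recorded hashes keyed by algorithm name,
 *                         e.g. array('md5' => '...', 'sha1' => '...').
 * @return bool TRUE only if every recorded hash matches the file.
 */
function file_matches_hashes($path, array $expected) {
  if (!is_file($path) || !is_readable($path)) {
    return FALSE;
  }
  foreach ($expected as $algo => $hash) {
    // hash_file() returns a lowercase hex digest of the file contents.
    $actual = hash_file($algo, $path);
    if ($actual === FALSE || strtolower($hash) !== $actual) {
      return FALSE;
    }
  }
  return TRUE;
}

// Usage (hypothetical path; paste the hashes shown in the EML output):
// $ok = file_matches_hashes('/path/to/data/file.csv', array(
//   'md5'  => '...',
//   'sha1' => '...',
// ));
// print $ok ? "hashes match\n" : "hashes are stale, re-upload the file\n";
```

The keys 'md5' and 'sha1' are the same names used in the filehash array in the template above, and both are valid algorithm names for hash_file().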