Media import

From CollectiveAccess Documentation

Latest revision as of 17:41, 19 April 2022

Version 1.5 and above

The media importer allows you to upload entire directories of media to CollectiveAccess at once. For each file it can generate a new "shell" record, to be completed later by hand or with the batch editor, match the file with an existing record, or do a combination of both. The settings documented below let you tailor a media import to a specific project and streamline your workflow.

This tool is accessed under "Import - Media" from the global navigation menu in Providence. Once there, you'll see all of the following options.

Import target

When importing media using this tool, or uploading media by hand while cataloging in CollectiveAccess, you are actually creating a new type of record called a "media representation." These records are almost always associated with an Object record or with a record from one of the other primary tables, such as Entities or Occurrences. Use "Import target" to set which type of record the media representations should be associated with.

Directory to import

Check the inspector to find out what server directory is associated with the importer tool. By default it is set to your installation's /import folder. However, you can change the directory in app.conf.

You'll see a hierarchy browser reflecting the import directory like this.

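[Image: Import directory.png, a hierarchy browser showing the directory "foo" and its sub-directories "ack", "bar", and "meow"]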

There are two directory settings in this browser:

Setting | Description
Include all sub-directories | Using the example above, if I import "foo" with this setting checked, the files stored in the sub-directories "ack", "bar", and "meow" are imported as well. Otherwise, only files stored directly in "foo" are imported.
Delete media after import | If this setting is checked, media is deleted from the import directory after it is uploaded to CollectiveAccess.
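
The effect of "Include all sub-directories" can be sketched in a few lines of Python. This only illustrates the behaviour described in the table and is not the importer's own code; the function name `list_media_files` is hypothetical.

```python
from pathlib import Path


def list_media_files(directory, include_subdirectories=False):
    """Return the files an import of `directory` would consider, sorted by path.

    Hypothetical helper: it mirrors the "Include all sub-directories"
    checkbox, not the actual CollectiveAccess implementation.
    """
    root = Path(directory)
    # rglob("*") walks the whole tree; glob("*") looks only at the top level.
    candidates = root.rglob("*") if include_subdirectories else root.glob("*")
    # Directories themselves are never imported, only the files inside them.
    return sorted(p for p in candidates if p.is_file())
```

Called on "foo" with the setting off, the sketch returns only the files sitting directly in "foo"; with it on, the files in "ack", "bar", and "meow" are returned too.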

Import Mode

Setting | Description
Import all media, matching with existing records where possible | This setting matches media with existing records wherever the media filename matches a record's idno. In all other cases, a new record is generated for each media file.
Import only media that can be matched with existing records | This setting imports only the media that can be matched with an existing record, by finding matches between filename and idno. All other media files are skipped.
Import all media, creating new records for each | This setting imports all media in the chosen directory. A brand new record is created for each media file.
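
To make the difference between the three modes concrete, here is a minimal Python sketch of the decision taken for each file. It assumes the identifier derived from the file has already been compared against the existing idnos. The names `ImportMode` and `plan_import` and the return values "attach", "create", and "skip" are invented for illustration and are not part of CollectiveAccess.

```python
from enum import Enum


class ImportMode(Enum):
    # Values are the setting labels from the table above.
    MATCH_OR_CREATE = "Import all media, matching with existing records where possible"
    MATCH_ONLY = "Import only media that can be matched with existing records"
    ALWAYS_CREATE = "Import all media, creating new records for each"


def plan_import(identifier, existing_idnos, mode):
    """Return 'attach', 'create', or 'skip' for a single media file (illustrative)."""
    if mode is ImportMode.ALWAYS_CREATE:
        # No matching is attempted; every file gets a brand new record.
        return "create"
    if identifier in existing_idnos:
        # The file's identifier matches an existing record's idno.
        return "attach"
    # Unmatched file: create a shell record or skip it, depending on the mode.
    return "create" if mode is ImportMode.MATCH_OR_CREATE else "skip"
```

For example, an unmatched file is turned into a new record under the first mode, skipped under the second, and turned into a new record under the third.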


Type

Setting | Description
Type used for newly created object | This menu is populated with the import target's type list. If the target is objects, you can select the specific object type here. The same is true for all other primary tables (entities, occurrences, etc.).
Type used for newly created object representations | Use this to set the media representation type. Default values are "front" and "back."


Set

This menu allows you to associate imported records with a set.

Setting | Description
Add imported media to set | This setting is followed by a drop-down menu populated with the names of existing sets. Use it to add all imported media to an existing set.
Create set | This setting creates a new set and adds all imported media records to it. It includes a field for the title of the new set. This option is particularly useful if you plan to edit these records later using the Batch Editor.
Do not associate imported media with a set | Use this setting if none of the imported media should be associated with a set.

Object identifier

Setting | Description
Set object identifier to | If creating new records, use this setting to manually set the target record's identifier.
Set object identifier to file name | This setting takes the media filename as the record identifier. For example, my_image.jpeg will be used as the identifier for the object record associated with this image.
Set object identifier to file name without extension | This setting creates (or matches) on the filename without its extension. In other words, my_image.jpeg simply becomes "my_image". It is particularly useful when matching on existing records whose identifiers match the file names but don't include the file extensions.
Set object identifier to directory and file name | With this setting, idnos are set using not only the filename but also the import directory path. For example, if "my_image.jpeg" is stored in a folder called "Media", the record's idno becomes /Media/my_image.jpeg
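
The three filename-based settings can be compared side by side with a short Python sketch. The function `identifier_from_file` and its `setting` strings are hypothetical. The sketch also assumes that "without extension" removes only the final suffix, which the text above does not spell out.

```python
from pathlib import PurePosixPath


def identifier_from_file(path, setting):
    """Derive a record identifier from a media file path (illustrative only).

    `path` is the file's path relative to the import root, e.g.
    "/Media/my_image.jpeg". The setting names are shorthand for the
    options in the table above.
    """
    p = PurePosixPath(path)
    if setting == "file name":
        return p.name          # my_image.jpeg
    if setting == "file name without extension":
        return p.stem          # my_image (assumes only the last suffix is dropped)
    if setting == "directory and file name":
        return str(p)          # /Media/my_image.jpeg
    raise ValueError(f"unknown setting: {setting!r}")
```

Running the sketch on "/Media/my_image.jpeg" yields "my_image.jpeg", "my_image", and "/Media/my_image.jpeg" respectively, matching the examples in the table.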

Status & access

This menu allows you to set the "status" and "access" fields for both the "import target" record and the representation record.

Setting | Description
Set object status to | Sets the import target's status (new, completed, editing in progress, etc.). For Objects this is the object status; for Entities, the entity status.
Set object access to | Sets the import target's access value to "accessible to public" or "not accessible to public."
Set representation status to | Sets the media representations' status (new, completed, editing in progress, etc.).
Set representation access to | Sets the media representations' access value to "accessible to public" or "not accessible to public."

Advanced Options

The media importer also includes several advanced options. In most cases the default settings and basic options are sufficient and you will not need these, but they are available when a project calls for them.


Matching

By default, matching occurs on filename. This setting allows you to match on directory name instead, or on directory name and then filename. Additionally, you can limit the matching by type.
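
One way to picture the matching options is as an ordered list of candidate values tried against the existing idnos. The Python sketch below rests on two assumptions that are not stated above: that "directory name, then filename" means the directory name is tried first and the filename second, and that the filename is compared without its extension. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def match_candidates(path, match_on="file name"):
    """Return the values to compare against record idnos, in order (illustrative)."""
    p = PurePosixPath(path)
    directory, filename = p.parent.name, p.stem  # assumption: extension is ignored
    if match_on == "file name":
        return [filename]
    if match_on == "directory name":
        return [directory]
    if match_on == "directory name, then file name":
        return [directory, filename]
    raise ValueError(f"unknown matching option: {match_on!r}")


def find_match(path, existing_idnos, match_on="file name"):
    """Return the first candidate that is an existing idno, or None."""
    for candidate in match_candidates(path, match_on):
        if candidate in existing_idnos:
            return candidate
    return None
```

Under these assumptions, a file stored as "2001.5/front.jpg" matches the record "2001.5" when matching on directory name, but finds no match with the default filename matching unless a record with the idno "front" exists.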

Object representation identifier

This setting is similar to the import target identifier setting, only it applies specifically to the media representation record, rather than the import target record.


Relationships

Some projects have a very structured way of assigning file names to media. A media file name may include not only an identifier for the file itself, but also identifiers for authorities or events depicted in the file. For projects with Entity authorities, for example, it's not uncommon for a media filename to include the identifier of an entity that the media depicts.

For example, let's say I have a photograph of Martha Graham and, in my system, her entity idno is "12345." I might name that image file 12345_image.jpeg to indicate that the image depicts Martha Graham. In that case, I can use the "relationships" setting to ensure that the object record associated with the imported image is also related to the entity record for Martha Graham (so long as her entity idno is in fact "12345"). With this setting you can select not only the related tables but also the relationship type. In this case, I might choose "depicted."
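
The naming convention in the Martha Graham example can be expressed as a small Python sketch. It assumes, purely for illustration, that the entity idno is everything before the first underscore in the filename. How the importer itself parses filenames is not described here, and the names `FILENAME_PATTERN` and `related_entity_idno` are hypothetical.

```python
import re

# Hypothetical naming convention: "<entity idno>_<anything>.<extension>".
FILENAME_PATTERN = re.compile(r"^(?P<entity_idno>[^_]+)_.+\.[^.]+$")


def related_entity_idno(filename):
    """Return the entity idno embedded in a filename, or None if there is none."""
    match = FILENAME_PATTERN.match(filename)
    return match.group("entity_idno") if match else None
```

With this convention, "12345_image.jpeg" yields "12345", which could then be looked up among the entity records and related to the new object with the chosen relationship type. A file named "image.jpeg" yields no idno and would get no relationship.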

Skip Files

With this setting, you may use Perl-compatible regular expressions to filter out files in the media directory that you wish to skip. You may also simply list out the filenames of those you wish to skip, one per line.
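
As a rough illustration of how a skip list behaves, the Python sketch below treats each non-empty line as both a literal filename and a regular expression. Two caveats apply. Python's `re` module is close to, but not identical to, Perl-compatible regular expressions. And the way the importer itself distinguishes patterns from plain filenames is not documented here, so the logic in the hypothetical `should_skip` function is an assumption.

```python
import re


def should_skip(filename, skip_lines):
    """Return True if `filename` matches any entry in the skip list (illustrative)."""
    for line in skip_lines:
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines in the list
        if entry == filename:
            return True  # exact filename listed on its own line
        try:
            if re.search(entry, filename):
                return True  # entry interpreted as a regular expression
        except re.error:
            continue  # not a valid pattern; it was only usable as a literal
    return False
```

A skip list containing the two lines `Thumbs.db` and `_thumb\.jpg$` would, under these assumptions, skip "Thumbs.db" and "scan_001_thumb.jpg" but still import "scan_001.jpg".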


Setting | Description
Allow duplicate media? | By default, duplicate media files are skipped. Use this setting if you wish to override this.
Log level | This setting controls the level of detail in the log. The log can capture errors, warnings, alerts, informational messages, and debugging messages. Choose the debugging level for the most comprehensive log.
