Using the Dutch AAT

Using the Dutch version of the Getty Art & Architecture Thesaurus (AAT)

The Netherlands Institute for Art History has translated the Getty Art & Architecture Thesaurus into Dutch. You can learn more about the product on their website.

From the Getty Research Institute web site:

CollectiveAccess supports the Dutch version in the same way as it supports the English version. Because the file format of the Dutch AAT differs from the one currently used for the Getty English version, a separate import script is used to import this data.

Obtaining and Importing the AAT File Set

You can obtain the Dutch AAT data set from the web site at http://www.aat-ned.nl/index.html.

The import utility is a PHP script named import_aat.php located in support/import/aat_dutch/. After downloading the XML data set, decompress it, rename the main data file to AAT.xml and place it in the same directory as the import_aat.php utility. Then run the utility, passing as its argument the hostname of the CollectiveAccess database you wish to import into. For example (assuming you are in the support/import/aat_dutch directory):

php import_aat.php demo.CollectiveAccess.org
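
The staging step described above can be sketched as a small shell helper. This is only an illustration: the function name stage_aat and the example paths are hypothetical, and the name of the downloaded data file depends on the download you receive.

```shell
# Hypothetical helper (not part of CollectiveAccess): copy the decompressed
# main data file into the import directory under the name AAT.xml.
stage_aat() {
    # $1: path to the decompressed main data file (its name varies by download)
    # $2: path to the support/import/aat_dutch directory of your installation
    src="$1"
    dest_dir="$2"
    [ -f "$src" ] || { echo "data file not found: $src" >&2; return 1; }
    [ -d "$dest_dir" ] || { echo "import directory not found: $dest_dir" >&2; return 1; }
    cp "$src" "$dest_dir/AAT.xml"
}

# Example usage (placeholder paths):
# stage_aat ~/Downloads/dutch_aat.xml /path/to/collectiveaccess/support/import/aat_dutch
```

Copying rather than moving the file keeps the original download intact in case the import has to be repeated.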

Note that the import script will refuse to run unless your installation defines either the nl_NL (Netherlands Dutch) or the nl_BE (Flemish Dutch) locale. Be sure to add one of these to your installation before trying to import the AAT. If your installation has both locales, nl_NL will be used.
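
The locale rule can be expressed as a short shell function. This is a sketch of the rule as described here, not the import script's actual code; the function name pick_dutch_locale is hypothetical, and it takes the installation's locale codes as plain arguments rather than reading them from the database.

```shell
# Illustration only: prefer nl_NL, fall back to nl_BE, otherwise fail.
pick_dutch_locale() {
    # Arguments: locale codes defined in the installation, e.g. en_US nl_BE
    found_be=""
    for code in "$@"; do
        case "$code" in
            nl_NL) echo "nl_NL"; return 0 ;;   # nl_NL always wins
            nl_BE) found_be="yes" ;;           # remember, but keep looking for nl_NL
        esac
    done
    if [ -n "$found_be" ]; then
        echo "nl_BE"
        return 0
    fi
    echo "no Dutch locale (nl_NL or nl_BE) defined" >&2
    return 1
}

# Example usage:
# pick_dutch_locale en_US nl_BE nl_NL
```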

What is Imported

The data file includes information about the content and structure of the thesaurus, as well as administrative and background data, such as change histories, source citations and contributor lists.

The import_aat.php utility imports a subset of this data, including:

  • Preferred terms
  • Parent-child relationships between preferred terms
  • Alternate ("non-preferred" in AAT terminology) terms
  • Associative relationships between terms
  • Descriptive notes for preferred terms

Other information, including term source data, contributors, revision histories and non-current terms, is not imported.

Common Problems

The import utility processes a large dataset and requires a significant amount of memory when running. In many default PHP installations, the maximum amount of memory a running program can allocate is 8 MB, which may be too low. The utility attempts to raise the memory limit when it runs, but depending on your server and PHP configuration this may not be possible. If you receive "out of memory" errors, raise the memory_limit directive in your php.ini configuration file. 512 MB (memory_limit = 512M) is a good value; alternatively, disable the limit completely by setting memory_limit to -1.
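
If editing php.ini is not convenient, the limit can also be set for a single run with the PHP command-line option -d, which sets an ini directive for that invocation only. The wrapper below is a minimal sketch; the function name run_import is hypothetical. Whether the value takes effect may depend on your configuration and on how the script adjusts the limit itself.

```shell
# Sketch: run the importer with a memory ceiling for this invocation only.
run_import() {
    # $1: memory limit, e.g. 512M (or -1 for no limit)
    # $2: hostname of the CollectiveAccess database to import into
    php -d "memory_limit=$1" import_aat.php "$2"
}

# Example usage (run from the support/import/aat_dutch directory):
# run_import 512M demo.CollectiveAccess.org
```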

Using the English language AAT

The "real" AAT is the English-language version from the Getty Research Institute. This is the only version guaranteed to be completely up-to-date. You can learn more about it on the Using the AAT page.
