The Getty Art and Architecture Thesaurus
Using the Getty Art & Architecture Thesaurus (AAT)
From the Getty Research Institute web site:
The Art & Architecture Thesaurus (AAT) is a structured vocabulary of around 34,000 concepts, including 131,000 terms, descriptions, bibliographic citations, and other information relating to fine art, architecture, decorative arts, archival materials, and material culture.
CollectiveAccess has supported the AAT since its initial v0.50 release. The core features of the AAT are implemented in CollectiveAccess: parent-child relationships, associative relationships, the ability to link any number of alternates to a term, support for guide and candidate terms, and storage of term documentation (scope notes and more). Vocabulary terms may be linked directly to object, authority (entity, place, collection and occurrence) and vocabulary data, and can also be used to enhance retrieval in searches.
Obtaining and Importing the AAT File Set
Once you have obtained your license, the Getty will send you a URL which provides download access to the AAT data in several file formats, including XML, XML UTF-8, MARC and "relational." CollectiveAccess includes a utility to import the XML UTF-8 format data, and only that format will work. If you mistakenly use the plain XML format files, all diacritics and non-Latin characters will appear incorrectly. The other formats will not work at all.
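Because the plain XML and XML UTF-8 file sets are easy to confuse, it can help to sanity-check the encoding of the downloaded file before importing. The following is a minimal sketch, not part of CollectiveAccess: the function name is_valid_utf8 is hypothetical, and it relies only on the standard iconv tool. Note that a file containing only ASCII characters also passes this check, so it can rule out a wrongly encoded file but cannot prove that you downloaded the right set.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file decodes
# cleanly as UTF-8, and fails otherwise.
# Usage: is_valid_utf8 <path-to-xml-file>
is_valid_utf8() {
    # Converting UTF-8 to UTF-8 fails on the first invalid byte sequence.
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

For example, `is_valid_utf8 AAT.xml || echo "AAT.xml is not valid UTF-8"` prints a warning if the file contains byte sequences that are not valid UTF-8.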
The import utility is a PHP script named import_aat.php located in support/data/aat/. After downloading the XML UTF-8 file set, decompress it and place the AAT.xml file contained in the set in the same directory as the import_aat.php utility. Invoke the import utility by running the program with the hostname of the CollectiveAccess database you wish to import into as an argument. For example (assuming you are in the support/data/aat directory):
php import_aat.php demo.CollectiveAccess.org
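The steps above can be wrapped in a small shell function that checks the files are in place before starting the import. This is only a sketch under stated assumptions: the function name run_aat_import is hypothetical, and the directory and hostname are passed in as arguments rather than taken from any CollectiveAccess setting.

```shell
# Hypothetical wrapper: confirm that AAT.xml sits next to import_aat.php,
# then run the importer from that directory.
# Usage: run_aat_import <aat_dir> <hostname>
run_aat_import() {
    aat_dir=$1
    ca_host=$2
    if [ ! -f "$aat_dir/import_aat.php" ]; then
        echo "import_aat.php not found in $aat_dir" >&2
        return 1
    fi
    if [ ! -f "$aat_dir/AAT.xml" ]; then
        echo "AAT.xml not found in $aat_dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$aat_dir" && php import_aat.php "$ca_host" )
}
```

From the root of a CollectiveAccess installation, this might be called as `run_aat_import support/data/aat demo.CollectiveAccess.org`.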
What is Imported
The data files provided by the Getty Research Institute include information about the content and structure of the thesaurus, as well as a wealth of administrative and background data, such as change histories, source citations and contributor lists.
The import_aat.php utility imports a subset of this data, including:
- Preferred terms
- Parent-child relationships between preferred terms
- Alternate ("non-preferred" in AAT terminology) terms
- Associative relationships between terms
- Descriptive notes for preferred terms
Other information, including term source data, contributors, revision histories and non-current terms, is not imported.
The AAT is a large dataset, and the import utility requires a significant amount of memory when running. In many default PHP installations, the maximum amount of memory a running program can allocate is 8 MB, which may be too low. The utility attempts to raise the memory limit when it runs, but depending on your server and PHP configuration this may not be possible. If you receive "out of memory" errors, edit the memory_limit directive in your php.ini configuration file to set a higher ceiling. 512 MB (memory_limit = 512M) is a good value; alternatively, set the directive to -1 to disable the limit completely.
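If you would rather not edit php.ini, the PHP command-line interpreter accepts a -d option that sets an ini directive for a single run. The sketch below assumes you are in the support/data/aat directory; the function name import_aat_with_limit is hypothetical and not part of CollectiveAccess.

```shell
# Hypothetical wrapper: run the importer with a per-run memory ceiling.
# Usage: import_aat_with_limit <limit> <hostname>
#   <limit> is a memory_limit value such as 512M, or -1 for no limit.
import_aat_with_limit() {
    limit=$1
    ca_host=$2
    # -d overrides the php.ini value for this invocation only.
    php -d "memory_limit=$limit" import_aat.php "$ca_host"
}
```

For example, `import_aat_with_limit 512M demo.CollectiveAccess.org`. Running `php --ini` shows which configuration file the command-line interpreter loads, which may differ from the one used by your web server.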
Accessing the AAT via the Getty web service interface
As of July 2009, AAT subscribers can access the latest version of the AAT through a web service provided by the Getty Research Institute. We are considering support for this web service via a new attribute type. AAT web service support would be provided in addition to direct importation of AAT data, and would not supplant it. The web service option ensures that the latest version of the AAT is always used, but direct import is probably more suitable for most installations because of the superior performance and flexibility it provides.
Using the Dutch translation of the AAT
There is a separate import utility for the Dutch version of the AAT. You can learn more about it on the "Using the Dutch AAT" page.