− Ken Sakamura −
Digitalization of museum data - In digital archiving, the format of the digitized data must be able to describe diverse attributes while at the same time remaining general, and these two requirements trade off against each other.
For example, when one item of earthenware is stored in the digital archive, the related data may include shape data, texture, X-ray CT (Computed Tomography) data, soil composition analysis data, and written commentary, all of which are related to one another. For exterior and CT data, for instance, it is desirable that the positional relationship between each CT cross section and the shape be mapped three-dimensionally. The same applies to recording which part of the exterior a written commentary refers to. Separately from this information, data is required that indicates the excavation site from which the earthenware was recovered and the state it was in. On the other hand, for ceramics from later ages, data from a chemical analysis of the glaze is likely to be added, and instead of excavation data it may be necessary to describe the work's provenance, such as the name of the craftsman, the feudal lords through whose hands it passed, and the old family in whose storehouse it was kept. Furthermore, for ancient documents, it will probably be necessary to add a reading of the text in the pronunciation of the time, in which case the timing of the reading must be aligned with the document. In this way, when attention is focused on the properties of individual materials, it becomes necessary to prepare many diverse data formats, several for each of many genres.
On the other hand, to study along the time axis how the patterns on ceramic ware influenced those of later periods, and horizontally how they influenced the patterns on items from other regions, it is desirable that the format of the digitally archived data not be specialized for each genre but be generalized as far as possible. In addition, to realize the 21st century concept of the decentralized museum, it is desirable that the databases held by museums share a standard format. The generalization meant here is not the simple kind in which a commentary is written to an attached text file in the expectation that people will read and understand it. The data structure must be properly decided, the meaning of each item must be defined, and the data must be understandable and processable by a computer. Otherwise it is impossible to carry out an integrated search across multiple museum databases.
To answer this need, we developed an attribute description data format for museums, which we call Museum TAD. TAD is an abbreviation for TRON Application Data-bus, the data format standardized across applications of TRON, the computer framework researched and developed by the Sakamura Laboratory at The University of Tokyo. The character code that serves as the foundation of text data within TAD is called TRON code. It is an open code with effectively unlimited character capacity, in which 130,000 characters, past and present, Eastern and Western, have been registered so far. Character collection is continuing. Museums require many archaeological characters that are missing from ordinary character codes, but because Museum TAD was developed as a variation of TAD, the problem of characters that cannot be used does not arise.
A more difficult problem than that of characters, however, is reconciling the two demands of diversity and generality mentioned above. Each museum, each museum department, and each genre of materials requires its own attribute descriptions. Some of these may coincide, but others differ. If generalization is forced, descriptive capacity drops; conversely, if special cases are accommodated, generality is lost. By a computer's "understanding" we do not mean anything like full-blown artificial intelligence. The "understanding" referred to here means a standard description at the level of metatags exchanged between different computers, such that the computer receiving a metatag can generally tell what kind of processing can be carried out on the tagged data. Deciding on the set of metatags, however, requires that a complete attribute set be stipulated, and there is always the possibility that new discoveries will call for new frameworks of attribute description. It is therefore impossible to stipulate a complete attribute set from the start.
That is to say, a framework is required that links diverse data (whose types may increase without limit) and turns them into a single object, one "stored item." We therefore used an object-oriented language to develop PCO (Portable Compound Object) - a description framework capable of generating new attribute description sets by adding, locally, only the minimum required definitions to an original class. We then had those definitions stored in a decentralized manner on the network and referred to via the network whenever they are required for interpretation. On the network, saying that a tag "can be understood" means that the tag's definition expresses its relationship to more basic tags and defines what processing can be carried out, and that an environment exists in which those definitions can be obtained when necessary. Adopting such an environment has made it possible to add, at any time, tag definitions detailed enough for each application.
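The idea that a tag is "understood" through its relationship to more basic tags can be illustrated with a minimal Python sketch. This is not the PCO implementation: the tag names, the operation names, and the dictionary standing in for definitions stored on the network are all hypothetical.

```python
# Hypothetical stand-in for tag definitions stored in a decentralized
# manner on the network. Each definition names a more basic tag
# ("parent") and the processing it adds. All names are illustrative.
REMOTE_DEFINITIONS = {
    "Measurement": {"parent": None, "operations": ["display"]},
    "ChemicalAnalysis": {"parent": "Measurement",
                         "operations": ["compare_composition"]},
    "GlazeAnalysis": {"parent": "ChemicalAnalysis",
                      "operations": ["plot_by_kiln"]},
}


def fetch_definition(tag):
    """Simulate obtaining one tag definition via the network."""
    return REMOTE_DEFINITIONS[tag]


def operations_for(tag):
    """List the processing a receiver can apply to data carrying `tag`.

    The chain of definitions is followed down to the most basic tag,
    so operations defined for basic tags come first in the result.
    """
    operations = []
    while tag is not None:
        definition = fetch_definition(tag)
        operations = definition["operations"] + operations
        tag = definition["parent"]
    return operations
```

Under these assumptions, a receiver that has never seen "GlazeAnalysis" can still process such data as a "Measurement", because the definition chain leads back to a tag it already handles.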
Using PCO, the properties for each category of exhibit are defined as an extensible library of description classes for each field. The description of a specific individual exhibit is created as an instance of a class by assigning specific values to its parameters. Classes can have an inheritance relationship, so by adding the properties needed for an individual genre to a basic class called "Earthenware," for example, specific classes such as Jomon earthenware and Yayoi earthenware are defined.
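The class and instance relationship described here might look as follows in Python. This is only a sketch using ordinary dataclass inheritance, not the PCO description language itself; the property names and the sample values are invented for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class Earthenware:
    """Hypothetical basic description class shared by all earthenware."""
    item_id: str
    excavation_site: str
    height_mm: float


@dataclass
class JomonEarthenware(Earthenware):
    """Adds only the property needed for this genre (illustrative)."""
    cord_pattern: str = "unknown"


@dataclass
class YayoiEarthenware(Earthenware):
    """Adds a different genre-specific property (illustrative)."""
    surface_finish: str = "unknown"


# The description of one exhibit is an instance with concrete values.
# The values below are made up for the example.
pot = JomonEarthenware(
    item_id="J-001",
    excavation_site="Site A",
    height_mm=320.0,
    cord_pattern="oblique",
)
record = asdict(pot)  # plain dictionary form, e.g. for exchange
```

Because `JomonEarthenware` inherits from `Earthenware`, any software that only knows the basic class can still read the inherited properties of `pot`.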
Such a framework makes it possible to minimize the volume of communication needed to define properties over the network. Because class definitions are cached, the definitions for frequently used fields are not transmitted again, and communication is kept to the minimum needed when received data contains unfamiliar tags. Because the definitions are object-oriented, it is possible to trace back the inheritance relationships of individual classes and find out how each was derived from the basic classes, and thus to translate properties described under two different classes in accordance with the relationship between them.
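A minimal sketch of both mechanisms, caching and tracing inheritance, is given below in Python. It is an illustration under stated assumptions rather than the actual protocol: the class names, property names, and the dictionary that plays the role of the network are hypothetical, and "translation" is reduced here to finding the properties two classes share through their common ancestor.

```python
# Hypothetical class definitions as they might be stored on the network.
DEFINITIONS = {
    "Earthenware": {"parent": None,
                    "properties": ["excavation_site", "height_mm"]},
    "JomonEarthenware": {"parent": "Earthenware",
                         "properties": ["cord_pattern"]},
    "YayoiEarthenware": {"parent": "Earthenware",
                         "properties": ["surface_finish"]},
}


class DefinitionCache:
    """Fetch each class definition at most once, then reuse it."""

    def __init__(self, fetch):
        self._fetch = fetch      # function standing in for a network call
        self._cache = {}
        self.fetch_count = 0     # number of simulated network requests

    def get(self, name):
        if name not in self._cache:
            self.fetch_count += 1
            self._cache[name] = self._fetch(name)
        return self._cache[name]


def lineage(cache, name):
    """Return the class followed by its ancestors, most specific first."""
    chain = []
    while name is not None:
        chain.append(name)
        name = cache.get(name)["parent"]
    return chain


def common_ancestor(cache, a, b):
    """Find the nearest class from which both a and b are derived."""
    ancestors_of_b = set(lineage(cache, b))
    for name in lineage(cache, a):
        if name in ancestors_of_b:
            return name
    return None


def shared_properties(cache, a, b):
    """Properties that can be carried over between descriptions of a and b."""
    ancestor = common_ancestor(cache, a, b)
    if ancestor is None:
        return []
    properties = []
    for name in reversed(lineage(cache, ancestor)):
        properties.extend(cache.get(name)["properties"])
    return properties


cache = DefinitionCache(DEFINITIONS.__getitem__)
```

In this sketch, comparing the two derived classes requires three fetches in total; repeating the comparison afterwards triggers none, since every definition involved is already cached.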
If someone attempted, as a single project, to stipulate complete attribute sets for all materials in museums throughout the world, it would be impossible simply because of the vast quantity of labor required. However, the widespread operation of such a framework is expected to enrich, in a decentralized manner, the description of museum materials, which are the intellectual assets of mankind. The result would be a net of interrelated attribute sets built by many people, each motivated by the benefit to their own university rather than acting as a volunteer for the whole.
In the future it is likely to become possible to connect all museums across the world on a network, and to perform high-speed data exchange at will. That being the case, it will also be possible to connect to display floors of other museums through a virtual space. This will mark the realization of a meta-museum in which museums throughout the world are linked with one another.