Metadata is usually described as data about other data. As structured data, it helps classify and identify the attributes of the information it describes. In Zen and the Art of Metadata Maintenance, John W. Warren described metadata as "both a universe and DNA." Metadata can describe a single aspect or many aspects of any material, and it is also used to summarize a material's basic information to simplify searching.
The History of Metadata
In 1968, Philip Bagley coined the word "metadata" in his book Extension of Programming Language Concepts. At that time, metadata meant "content about individual instances of data content," namely structural metadata, rather than descriptive metadata or metacontent commonly used in library directories. Since then, the term has been widely accepted in various fields, including information management, informatics, information technology, library science and geographic information systems.
Metadata is especially important on web pages. A page's metadata can contain the publication date and time, the author, the article length and other descriptions, as well as keyword tags for the linked content. This metadata is embedded in the page as "metatags," which search engines can read to return more accurate results.
Until the late 1990s, metatags were an important factor in determining the order of search engine results. In the late 1990s, however, many websites began stuffing their metadata with keywords to cheat search engines. This abuse misled many search engines into treating a page as more relevant to a query than it really was.
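To make the idea of metatags concrete, here is a minimal sketch of how a program might read them from a page, using only Python's standard library. The sample page, the `MetaTagParser` class and the `extract_meta` helper are made up for illustration; real search engines use far more elaborate pipelines.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta name="..." content="..."> tags are of interest here.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


# Made-up page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Example article</title>
<meta name="author" content="Jane Doe">
<meta name="keywords" content="metadata, storage, web3">
<meta name="description" content="A short note on metadata.">
</head><body><p>Body text.</p></body></html>
"""


def extract_meta(html):
    """Return a dict of the metatags found in an HTML string."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


if __name__ == "__main__":
    for key, value in extract_meta(SAMPLE_PAGE).items():
        print(f"{key}: {value}")
```

Because the keywords are whatever the page author chose to write, nothing in this step checks them against the page body, which is why keyword stuffing was possible.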
The Storage and Acquisition of Metadata
Metadata can be stored and managed in what is called a metadata registry or metadata repository. However, without context and a point of reference, it is difficult to identify metadata simply by looking at it. To use an analogy, a database may contain some numbers, but those numbers may be the result of a complex computation. A number may also be an ISBN (used for books) or an ICD-10 code (used in health care), but that meaning cannot be read from the data container alone.
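The point about context can be illustrated with a small sketch: the same 13-digit string means nothing on its own, but once a metadata record states that the field holds an ISBN-13, a program can interpret and even check it. The `FieldMetadata` record and the `interpret` function are hypothetical names chosen for this example; the checksum rule is the standard ISBN-13 one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    """Hypothetical metadata record describing what a stored value means."""
    name: str
    scheme: str
    description: str


def is_valid_isbn13(value: str) -> bool:
    """Check the ISBN-13 checksum: digits weighted 1, 3, 1, 3, ... sum to a multiple of 10."""
    digits = [int(ch) for ch in value if ch.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def interpret(value: str, meta: FieldMetadata) -> str:
    """Describe a raw value using the context supplied by its metadata."""
    if meta.scheme == "ISBN-13":
        status = "valid" if is_valid_isbn13(value) else "invalid"
        return f"{meta.name}: {value} ({status} ISBN-13; {meta.description})"
    return f"{meta.name}: {value} ({meta.scheme}; {meta.description})"


if __name__ == "__main__":
    raw_value = "9780306406157"  # without metadata, just thirteen digits
    book_field = FieldMetadata("book_id", "ISBN-13", "identifier of a catalogued book")
    print(interpret(raw_value, book_field))
```

Without the `FieldMetadata` record, the program would have no basis for choosing the ISBN check over any other interpretation of the digits.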
Web3 has shown a lot of potential and promise since its inception, and the importance of data has gained much attention along with it. In Web3, people begin to own their data by storing it in a distributed manner, and how to retrieve that data becomes a new problem to solve. In Web2, metadata has been used to summarize the basic information of materials to simplify the search process, so it is likely to be even more important in Web3. Both data and metadata face the same question: where should they be stored?
Several options are available. The first option is to place the metadata in a central location, such as a central repository. In such a central repository, users can easily and quickly find the information they are interested in by utilizing metadata, provided that the metadata in the database is correct.
This approach has hidden dangers that cannot be ignored: the metadata may be maliciously tampered with to manipulate search results or to make files unreadable, or the files that the metadata points to may be moved so that the metadata is no longer valid. The latter can be addressed with technology that speeds up the synchronization of files and their metadata, although each update still has some impact on the database. The former is a much more difficult problem. For such a huge database, protection and backup are a top priority.
In Web3, data and metadata are not stored in a centralized way. The second option is therefore to put the metadata together with the data itself, and then store and back up both in a distributed way. The remaining problem is how to retrieve the metadata once it has been stored in a distributed manner.
When it comes to storage, the OOD (Owner Online Device) addressing of DMC (Datamall Chain) storage deserves mention, because it remedies the problem that metadata cannot be easily identified. DMC adopts the DSG Protocol designed by CYFS to store data securely, in encrypted form, in space that other users have only limited access to. This enables multiple backups at different sites. The more backups there are, the less likely the data is to be deleted or lost.
At the same time, this kind of backup requires paying a rental fee to the data storage provider, and these transactions can only be carried out through contracts on the Datamall Chain. With such off-site backup, even if a user's OOD is damaged one day, a user who has kept the private key can obtain the list of backed-up data from the Datamall Chain after restarting an OOD anywhere in the world.
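The claim that more backups make loss less likely can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that each backup site fails independently with the same probability; neither that assumption nor the 10% figure comes from DMC, and real sites may fail in correlated ways.

```python
def loss_probability(p_single: float, copies: int) -> float:
    """Probability that every copy is lost, assuming independent failures.

    p_single is the (assumed) chance that one backup site loses its copy;
    copies is the number of sites holding a copy.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


if __name__ == "__main__":
    # Illustrative failure rate only; not a measured value.
    for n in (1, 2, 3, 5):
        print(f"{n} copies -> loss probability {loss_probability(0.1, n):.5f}")
```

Under these assumptions, each additional copy multiplies the chance of total loss by the single-site failure rate, which is the intuition behind backing up at several sites.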
A New Way of Data Fast Access and Retrieval
In DMC storage, users' data is stored in their own OOD through the CYFS Protocol, so fast access can be achieved as long as the OOD holding the data can be located. To make this possible, each piece of data gets a unique identifier (which we call the data URL) that contains the owner ID.
As shown in the following figure:
The URL of a piece of data is unique and does not change. To access the data, another user parses the URL, uses the owner_id in the first segment to find the current IP address of the owner's OOD on the Datamall Chain, connects to that OOD point-to-point, and then requests the data from it using the content fingerprint in the second segment. This process requires only one lookup, so retrieval is very fast whether or not the data exists.
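The lookup steps described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the URL layout `example://<owner_id>/<content_fingerprint>`, the use of SHA-256 as the fingerprint, and the in-memory dictionaries standing in for the Datamall Chain and for OOD devices are all hypothetical and are not taken from the CYFS or DMC specifications.

```python
import hashlib
from urllib.parse import urlparse


def fingerprint(data: bytes) -> str:
    """Content fingerprint; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).hexdigest()


def parse_data_url(url: str):
    """Split a data URL into (owner_id, content_fingerprint).

    Assumes the hypothetical layout example://<owner_id>/<content_fingerprint>.
    """
    parsed = urlparse(url)
    owner_id = parsed.netloc
    content_id = parsed.path.lstrip("/")
    if not owner_id or not content_id:
        raise ValueError(f"not a valid data URL: {url!r}")
    return owner_id, content_id


def resolve(url: str, chain_registry: dict, ood_nodes: dict):
    """Return the data for a URL, or None if the owner's OOD does not hold it.

    chain_registry maps owner_id -> current OOD address (stand-in for the chain).
    ood_nodes maps OOD address -> {fingerprint: bytes} (stand-in for the device).
    """
    owner_id, content_id = parse_data_url(url)
    address = chain_registry.get(owner_id)        # the single lookup on the chain
    if address is None:
        raise LookupError(f"unknown owner: {owner_id}")
    data = ood_nodes[address].get(content_id)     # direct request to the OOD
    if data is None:
        return None                               # a fast answer even when absent
    if fingerprint(data) != content_id:
        raise ValueError("content does not match its fingerprint")
    return data


if __name__ == "__main__":
    payload = b"hello, web3"
    content_id = fingerprint(payload)
    registry = {"owner-alice": "203.0.113.7:8080"}    # documentation-range address
    nodes = {"203.0.113.7:8080": {content_id: payload}}
    print(resolve(f"example://owner-alice/{content_id}", registry, nodes))
```

Because the fingerprint is derived from the content, the requester can verify what it receives, and because the owner ID maps straight to one device, a miss is reported after a single lookup instead of a network-wide search.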
The Future of Data
One of the key tenets of Web3 is data ownership. Metadata can help document (1) who created the data, i.e., the authenticity of the data, (2) whether or not the data is unique and (3) who currently owns the data. Whether for NFTs today or for Web3 in the future, the trend is toward accurately recording, storing and retrieving metadata, and toward developing retrieval in the data warehouse field, so as to ensure the authenticity and value of information and protect the rights of information owners.
References:
1. Kranz, Garry. "What Is Metadata and How Does It Work?" WhatIs.com, TechTarget, 12 July 2021, https://www.techtarget.com/whatis/definition/metadata
2. "Metadata." Wikipedia, Wikimedia Foundation, 2 Nov. 2022, https://en.wikipedia.org/wiki/Metadata
3. Layton, Jeffrey. "The Metadata Storage Problem." Enterprise Storage Forum, 22 May 2013, https://www.enterprisestorageforum.com/management/the-metadata-storage-problem/