Metadata we need to represent
- embedded in the file/data object contents
- external, provided by filesystem/container
- external, provided by user/annotation apps etc. These may be stored in e.g. filesystem extended attributes or in a separate database.
Embedded in the file/data object contents
Generic content
Content creators
- author (primary contributor)
- contributor (secondary contributor)
- maintainer
Author's annotation/categorization of the content
- subject
- title
- description
- comment
- keywords
- creation/last modification time
- IDs (various)
- plain-text representation for indexing purposes (not a good fit here, but it's the closest)
Content legal info:
- copyright
- disclaimer
- license type
- license text
Content relations to other content:
- containment
- dependency
- link/mention
- conflicts? A must for software. Do they apply to generic content?
- supersedes?
Content generator software specifics:
- software name (or, better yet, a semantic link)
- software options
Content format description/specifics:
- encoding/code page (transitional until obsoleted by UTF-8)
- languages
- format subtype (as BMP has), or a way to specify format extensions
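The embedded fields listed above could be gathered into one record type. The following is only a minimal sketch: the class and field names (`EmbeddedMetadata`, `LegalInfo`, `Relation`, `generator_name` and so on) are hypothetical and chosen for illustration, and every field is optional because a given format may carry only a few of them.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class LegalInfo:
    # Content legal info; all names here are illustrative assumptions.
    copyright: Optional[str] = None
    disclaimer: Optional[str] = None
    license_type: Optional[str] = None
    license_text: Optional[str] = None


@dataclass
class Relation:
    # kind is one of the relation types from the outline, e.g.
    # "containment", "dependency", "link", "conflicts", "supersedes".
    kind: str
    target: str  # some ID of the related content


@dataclass
class EmbeddedMetadata:
    # Content creators
    authors: List[str] = field(default_factory=list)
    contributors: List[str] = field(default_factory=list)
    maintainers: List[str] = field(default_factory=list)
    # Author's annotation/categorization
    subject: Optional[str] = None
    title: Optional[str] = None
    description: Optional[str] = None
    comment: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    created: Optional[datetime] = None
    modified: Optional[datetime] = None
    ids: Dict[str, str] = field(default_factory=dict)  # scheme -> value
    plain_text: Optional[str] = None  # for indexing
    # Legal info and relations
    legal: LegalInfo = field(default_factory=LegalInfo)
    relations: List[Relation] = field(default_factory=list)
    # Generator software
    generator_name: Optional[str] = None
    generator_options: Optional[str] = None
    # Format description
    encoding: Optional[str] = None
    languages: List[str] = field(default_factory=list)
    format_subtype: Optional[str] = None


meta = EmbeddedMetadata(title="Example", authors=["A. Author"], keywords=["metadata"])
meta.relations.append(Relation(kind="dependency", target="urn:example:other"))
```

Keeping legal info and relations as separate small types is one possible design choice; it lets them be reused if the same structures turn out to apply to user annotations as well.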
External, provided by filesystem/container
- Location/ID
- Creation/access/modification time
- ACL (access control)
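For plain files, most of the filesystem-provided items above can be read with Python's standard `os.stat`. This is a sketch under assumptions: the function name `filesystem_metadata` and the dictionary keys are made up for illustration, and the permission string from `st_mode` is only a rough stand-in for a real ACL, which `os.stat` does not expose.

```python
import os
import stat
from datetime import datetime, timezone


def filesystem_metadata(path):
    """Collect location, ID, timestamps and permission bits for a path."""
    st = os.stat(path)

    def to_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    return {
        "location": os.path.abspath(path),
        # Device + inode identifies a file on most Unix filesystems.
        "id": (st.st_dev, st.st_ino),
        "accessed": to_utc(st.st_atime),
        "modified": to_utc(st.st_mtime),
        # st_ctime is the last metadata change on Unix; on Windows it has
        # historically reported the creation time instead.
        "changed_or_created": to_utc(st.st_ctime),
        # Permission bits such as "-rw-r--r--"; not a full ACL.
        "mode": stat.filemode(st.st_mode),
    }
```

Containers other than a filesystem (archives, mail stores) would need their own equivalent of this function.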
Media
Generic
- sample format
- codec
- duration
- frame count
- FPS
Audio
Music
- ID3 is designed to tag music
Visual
- image size
- aspect ratio
- resolution
Video
- frame count
- FPS
Audio+Video
Photo:
- EXIF is a good base for this
===Documents===
===Messaging===
Generic message
- recipient
Email
- to
- cc
- bcc
Contacts
External, provided by user/annotation apps etc. These may be stored in e.g. filesystem extended attributes or in a separate database.
- User's annotation of the content, similar to the content author's. Note: to a degree, the file name is part of the user's annotation.
- Quality and similar ratings
- Usage intensity
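The separate-database option could look roughly like the sketch below, which uses Python's standard `sqlite3` module with an in-memory database. The table name `annotation`, its columns, and the helper functions are all hypothetical; storing keywords as a comma-joined string is a simplification, and usage intensity is approximated here by a simple use counter.

```python
import sqlite3


def open_annotation_db(path=":memory:"):
    """Open (and if needed create) a hypothetical user-annotation store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS annotation (
            object_id TEXT PRIMARY KEY,   -- location or other ID of the object
            title     TEXT,
            keywords  TEXT,               -- comma-joined, for simplicity
            rating    INTEGER,            -- quality rating
            use_count INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    return conn


def annotate(conn, object_id, title=None, keywords=(), rating=None):
    """Create or overwrite the user's annotation for one object."""
    conn.execute("INSERT OR IGNORE INTO annotation (object_id) VALUES (?)", (object_id,))
    conn.execute(
        "UPDATE annotation SET title = ?, keywords = ?, rating = ? WHERE object_id = ?",
        (title, ",".join(keywords), rating, object_id),
    )


def record_use(conn, object_id):
    """Increment the usage counter, creating the row if it is missing."""
    conn.execute("INSERT OR IGNORE INTO annotation (object_id) VALUES (?)", (object_id,))
    conn.execute(
        "UPDATE annotation SET use_count = use_count + 1 WHERE object_id = ?",
        (object_id,),
    )


conn = open_annotation_db()
annotate(conn, "file:///home/user/report.txt", title="Q3 report", keywords=["work"], rating=4)
record_use(conn, "file:///home/user/report.txt")
record_use(conn, "file:///home/user/report.txt")
row = conn.execute(
    "SELECT title, rating, use_count FROM annotation WHERE object_id = ?",
    ("file:///home/user/report.txt",),
).fetchone()
```

A database survives copies to filesystems without extended-attribute support, but unlike extended attributes it can lose track of an object when the file is moved or renamed outside the annotating application.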
Data types
Object
Content
Document
Text
Media
Audio
Music
Image
Photo
Video
Software
Source type
file
attachment
message
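One way to read the two lists above is as a type hierarchy plus an enumeration. The sketch below assumes a particular nesting (Document, Media and Software under Content; Text under Document; Audio, Image and Video under Media; Music under Audio; Photo under Image). That nesting is a guess, since the outline does not show indentation, and the helper `lineage` is hypothetical.

```python
from enum import Enum


class SourceType(Enum):
    FILE = "file"
    ATTACHMENT = "attachment"
    MESSAGE = "message"


# Assumed nesting of the data types; adjust if the intended tree differs.
class Object: pass
class Content(Object): pass
class Document(Content): pass
class Text(Document): pass
class Media(Content): pass
class Audio(Media): pass
class Music(Audio): pass
class Image(Media): pass
class Photo(Image): pass
class Video(Media): pass
class Software(Content): pass


def lineage(cls):
    """Return the type names from most to least specific."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


photo_lineage = lineage(Photo)
```

With such a hierarchy, metadata defined for a general type (for example the generic content fields) would apply to every more specific type beneath it.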