MetaData we need to represent

  • embedded in the file/data object contents
  • external, provided by filesystem/container
  • external, provided by user/annotation apps etc. these maybe stored in e.g. filesystem extended attributes and may be stored in a separate db.

Embedded in the file/data object contents

Generic content

Content creators

  • author(primary contributor)
  • contributor(secondary contributor)
  • mantainer Author's annotation/categorization of the content

  • subject

  • title
  • description
  • comment
  • keywords
  • creation/last modification time
  • IDs(various)
  • plain-text representation for indexing purposes(not a good fit here, but it's closest) Content legal info:

  • copyright

  • disclaimer
  • license type
  • license text Content relations to other content:

  • containment

  • dependency
  • link/mention
  • conflicts? -- a must for software. do they apply to generic content?
  • supercedes? Content generator software specifics:

  • software name(or better yet semantic link :)

  • software options Content format description/specifics:

  • encoding/code page(transitional until obsoleted by UTF8).

  • languages
  • format subtype like BMP has or to specify format extensions.

External, provided by filesystem/container

Location/ID

Creation/access/modification time

ACL(access control)

Media

Generic

  • sample format
  • codec
  • duration
  • frame count
  • FPS Audio

Music

  • ID3 is designed to tag music Visual

  • image size

  • aspect ratio
  • resolution Video

  • Frame count

  • FPS Audio+Video

Photo:

  • EXIF is a good base for this ===Documents=== ===Messaging=== Generic message

  • Recipient Email

  • to

  • cc
  • bcc

Contacts

External, provided by user/annotation apps etc. these maybe stored in e.g. filesystem extended attributes and may be stored in a separate db.

User's annotation of content similar to content author's one. Note: To a degree file name is a part user's annotation.

Quality etc ratings

Usage intensity

Data types

Object

Content

Document

Text

Media

Audio

Music

Image

Photo

Video

Software

Source type

file

attachment

message