Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.2k views
in Technique[技术] by (71.8m points)

database design - What tag schema(s) are the most efficient/effective?

http://tagging.pui.ch/post/37027745720/tags-database-schemas

Stackoverflow's tag handling is among the best that I've seen so far.

Does anyone know if it is a schema pattern I could get some ideas from?

Otherwise, I'm just looking for suggestions on what tag schemas others have successfully implemented.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It all depends on data volumes and content to tag distribution and density ratios

If you have a low tag distribution and density ratio (typical human generated data) you can simply generate an unique id or hash for each possible collection of tags in use by the data. Associate the 'tag collection' id with each data instance with those tags

This can work surprisingly well for many forms of human generated data

e.g. Stackoverflow has ~500,000 questions, and ~20,000 tags (too many dupe-ish tags!). Most questions have less than five tags. At worst case scenario you will have 500,000 'tag collection' id's to associate , but more realistically you will have several thousand

You also will either have to have instance tracking or garbage collection on the 'tag collection' collection as specific combination of tags fall out of use

e.g.

  • Tag: id, tagName
  • TagCollection: id, instanceCount
  • TagCollectionTag: tagCollectionIId, tagId
  • Data: id, title, content, tagCollectionId

Inserting tags is fast if a hash is used (hash on all tags of the collection). Otherwise you have to search the TagCollection and TagCollectionTag collections, but this should not be too large anyway

Searching is fast; search TagCollectionTag for instances containing the specific set of tags, and then find data rows with any of those tagCollectionId's

Hope that wasn't too confusing :-)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...