Friday, November 18, 2005
Chris Suver, Microsoft
What folks have discovered is really the effect of economics on data typing . . . Simple economics controls whether data is typed or untyped . . . The essential point is that the structuring (or typing) of the data is only partially complete. It’s not that the data is intrinsically different, nor that it doesn’t have type (or can’t be typed). It is simply that the effort to fully define the type of the data is not worthwhile. As a result, the author adds only as much type information as necessary to satisfy the immediate needs. Keep in mind that semi-structured data is a task left uncompleted rather than something fundamentally new. It is simply that the information has only part of its type information in place.
. . . At one extreme is the data that is stored today in SQL databases, such as accounting data (demonstrably high-value data).
. . . At the other extreme is data on the Web. On the whole, this is very low value . . . Search is heavily used, even though the bulk of its raw data is of no value, and the results are often noisy. Low-value data, but aligned with low cost, makes this a cost/benefit win. As a result, this is one of the most widely used tools on the Web, another clear success.
. . . I refer to XML typing as a soft system because the typing is applied to the data when the XML is processed. Often the types used by the sender and the receiver are different. Sometimes these are large differences, sometimes small, but any difference means that the transport itself must be softly typed so that it can be easily adapted to the different uses. So, XML’s authoring cost is low. The author can choose to add as much meta-data (i.e., structure) or as little as appropriate. Here, again, is a case where we have an excellent match between the technology and the user. From this point of view it makes sense that XML has been well-received.
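To make the "soft typing" point concrete, here is a minimal sketch of two receivers applying different types to the same XML message at processing time. The message, its field names, and the `read_typed` helper are all hypothetical and invented for illustration; nothing here comes from Suver's article.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

# Hypothetical message: the sender attaches no type information at all.
MESSAGE = (
    "<order><id>1042</id><total>19.90</total>"
    "<placed>2005-11-18</placed></order>"
)


def read_typed(xml_text, field_types):
    """Apply the receiver's own types to the fields it cares about.

    Fields not listed in field_types are left as plain strings.
    """
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        convert = field_types.get(child.tag, str)
        record[child.tag] = convert(child.text)
    return record


# Receiver A (say, an accounting system) pays the cost of typing every field.
accounting = read_typed(
    MESSAGE,
    {"id": int, "total": Decimal, "placed": date.fromisoformat},
)

# Receiver B (say, a search indexer) only bothers to type the id.
indexer = read_typed(MESSAGE, {"id": int})
```

Both receivers consume the same bytes; the only difference is how much type information each one decides is worth applying, which is the economic trade-off the excerpt describes.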