Automated Title Material Authorship
Much of the technology used by EdgeMaven relies on databases created, and approaches patented, by Professor Philip M. Parker, INSEAD Chair Professor of Management Science. Since the patent was issued in September 2007, press coverage has led to a number of queries about content authoring strategies and alternative approaches. The following discussion is an introduction.
Some like calling it a “book writing machine” or “software”, but in fact the technology is a computer-based automation process for authoring, irrespective of the format (book, video, PC games, etc.), language, or subject (fiction or non-fiction). Those interested in the technical aspects of the process should refer to the actual patent, which presents flow diagrams, etc., and to a YouTube video that tersely describes the process and shows an example of an application and some output:
It is strongly recommended that interested persons read the full patent. On the patent page, the reader will find detailed technical descriptions of the process and the prior art. Professor Parker began working on this project in the early 1990s. The goal was to create original titles (books, videos, games, etc.) on topics that would not be economically viable if published using traditional methods, or on topics of interest to a limited audience that would nevertheless find the titles useful (what some call the “long tail”). The process does not require “Internet scraping”, and most existing implementations of the process are Internet independent. The patent is written as a “pioneer patent” as it applies to all forms of original title materials (videos, books, PC games, etc.) created in this fashion.
It is convenient to see the process as resulting in three methods of authorship automation (methods can be used in combination).
Method 1: Compiles existing information, sorts and formats it, and draws basic conclusions (e.g. if there is no pre-existing content, then this fact alone may lead to original logical conclusions drawn about the topic). This level is useful for consolidating and structuring knowledge in a domain where much of the text, video or sound already exists. The programming for this method typically involves hundreds of details, especially with respect to formatting and style. The output typically takes the form of a compilation, but some of its components will be original and can result in new knowledge.
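As a rough illustration of Method 1, the following Python sketch compiles records, sorts and formats them, and draws two basic conclusions, including the "no pre-existing content" case mentioned above. It is a minimal sketch under stated assumptions and not the patented process: the function name `compile_title`, the `(section, text)` record layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def compile_title(topic, records):
    """Compile pre-existing records into a sorted, sectioned plain-text title.

    `records` is a list of (section, text) pairs. This layout is an
    assumption made for the example only.
    """
    sections = defaultdict(list)
    for section, text in records:
        sections[section].append(text.strip())

    lines = [topic.upper(), ""]
    if not sections:
        # The absence of content is itself a reportable conclusion.
        lines.append(f"No pre-existing material was found on {topic}.")
        return "\n".join(lines)

    # Sort sections and entries so the layout is deterministic.
    for section in sorted(sections):
        lines.append(section)
        for text in sorted(sections[section]):
            lines.append(f"  - {text}")
        lines.append("")

    # A basic derived conclusion: which section is best documented.
    largest = max(sorted(sections), key=lambda s: len(sections[s]))
    lines.append(
        f"Most documented area: {largest} ({len(sections[largest])} entries)."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        ("Treatment", "Drug A"),
        ("Diagnosis", "Test B"),
        ("Treatment", "Drug C"),
    ]
    print(compile_title("Gout", sample))
```

A real genre would replace the plain-text layout with the many formatting and style rules the method description alludes to; the sketch only shows the compile, sort, format, conclude sequence.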
Method 2: Involves replicating a formula within a genre. New knowledge is not necessarily generated, though the reader or viewer may end up acquiring new knowledge. The data (words) may be in the public domain on a stand-alone basis, but the output is as original as what a human author (or director, screenwriter or actor in the case of a movie) might create. The final result is typically wholly original.
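One way to picture Method 2 is a fixed genre formula whose slots are filled from a stock of stand-alone words. The Python sketch below is only an illustration of that idea: the three-line `FORMULA`, the `VOCABULARY` table, and the function `author_title` are hypothetical and far simpler than any real genre.

```python
import random

# Hypothetical genre formula: a fixed sequence of slots.
FORMULA = [
    "{hero} left {place} at dawn.",
    "{hero} met {rival} on the road.",
    "{hero} returned to {place} with {prize}.",
]

# Stand-alone words and phrases; each is unremarkable on its own.
VOCABULARY = {
    "hero": ["Ana", "Tomas", "Mirela"],
    "place": ["the harbor", "the village", "the old mill"],
    "rival": ["a stranger", "an old friend", "a merchant"],
    "prize": ["a map", "a letter", "a silver key"],
}


def author_title(seed):
    """Fill the formula once; the same seed always yields the same text."""
    rng = random.Random(seed)
    choices = {slot: rng.choice(options) for slot, options in VOCABULARY.items()}
    return "\n".join(line.format(**choices) for line in FORMULA)


if __name__ == "__main__":
    for seed in range(3):
        print(author_title(seed))
        print()
```

With three options for each of four slots, this toy formula can produce 81 distinct texts from the same structure, which is the sense in which each output can be new while the formula stays fixed.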
Method 3: Involves the generation of new knowledge as the primary goal. This involves, for example, the computer mimicking a specialist who is asked to prepare a report, film or game that draws original conclusions, images or levels of entertainment. For example, if one asks an economist for an opinion, the economist will typically perform an analysis and make summary statements based on his or her findings. The automation process, in this case, literally follows the behaviors of the economist and reports the findings: findings that have never appeared before in any format, do not pre-exist in any database, and are not currently available on the Internet. The computer, in this case, is pre-invested with knowledge or expertise (e.g. economic models and knowledge of economic geography). For this method, the word “specialist” is domain independent. We can rewrite the example from above to be:
“For example, when one asks a poet for a poem on a given subject, they will typically ponder on the subject and write prose based on their inspirations. The process, in this case, literally follows the behaviors of a poet, and creates a poetry book – consisting of poems that have never appeared before in any format.”
The distinction between poetry and econometrics lies in the differing formulaic natures of the genres, not in the process used to author them. Method 3 can create high-end econometrics to the same degree that it can write poetry. It turns out that the most useful applications of Method 3 are for genres that are so complex or labor intensive that automation is almost the only viable approach. Writing hundreds of original high-quality Ph.D. theses will be easier to accomplish using this approach than writing a single original and high-quality children’s story (given the lack of formulaic sub-genres that can be reverse engineered). “Human creativity” in this sense is the absence of formulaic authorship techniques that can be reverse engineered. Creative authors, therefore, need not fear being replaced by this process. The same is true for creative doctoral students, moviemakers, television producers or PC game makers.
Research began on this approach in the early 1990s. The first titles authored via full automation relied on Method 3 (described above), with the goal of generating new knowledge that would be difficult to produce otherwise. These came in the form of e-books distributed via high-end distributors dedicated to this market (Dialog, MarketResearch.com, etc.) and then print-on-demand titles (Ingram’s LSI and Amazon’s Booksurge). The “Trade Perspective” series was created because import data reported by importers are inconsistent with export data reported by exporters. The model comes up with maximum-likelihood estimates of real trade flows (adjusting for currency fluctuations), a rather boring process but of interest to people involved in international trade. This series is mostly used by government agencies and businesses. Similar series using Method 3 are the “World Outlook Reports”, which produce Bayesian econometric estimates of the worldwide latent demand for various products and services, and the “financial and labor benchmark” series, which mimics the process used by accounting firms and/or investment banks to compare real differences in economic performance across firms and/or economies with differing accounting rules. Each of these series carries a very large upfront cost (many man-years of programming in most cases), but once this is accomplished, the incremental cost per title is very low (the costs mentioned by journalists are the incremental cost of about 50 cents, not the total or average cost per title, which are much higher when start-up costs are considered). Samples of these books can be found at http://www.icongrouponline.com/browse/.
Later, series using a combination of Method 1 and Method 2 were created in the form of patient and physician sourcebooks. Around 2001, medical libraries launched efforts on “Internet training” for their patrons (e.g. how to use the Internet to research diseases). This series was created for this market and is mostly distributed via OCLC’s NetLibrary service in e-book format. Method 1 was also used to create a series of bilingual Classics titles that provide a running thesaurus in the language of the reader.
More recently, multilingual crossword puzzle books and thesauri were created using Method 2. Some of the thesauri rely on a graph theoretic approach (combined with traditional computational linguistics) to derive what is probably the world’s largest multilingual thesaurus.
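The source does not describe the graph-theoretic approach in detail, so the following Python sketch shows only one simple, commonly used idea of that kind, offered as an assumption: treat translation pairs as edges in a graph, and treat two words of the same language as synonym candidates when they share a translation. The names `build_graph` and `synonym_candidates` and the sample `PAIRS` are hypothetical.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected graph from translation pairs.

    Each node is a (language_code, word) tuple.
    """
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def synonym_candidates(graph, word):
    """Return same-language words that share a translation with `word`."""
    lang = word[0]
    found = set()
    for translation in graph.get(word, ()):
        for neighbor in graph[translation]:
            if neighbor != word and neighbor[0] == lang:
                found.add(neighbor)
    return sorted(found)


# Illustrative data only, not taken from any real dictionary database.
PAIRS = [
    (("en", "big"), ("fr", "grand")),
    (("en", "large"), ("fr", "grand")),
    (("en", "tall"), ("fr", "grand")),
    (("en", "big"), ("de", "groß")),
    (("en", "small"), ("fr", "petit")),
]

if __name__ == "__main__":
    g = build_graph(PAIRS)
    print(synonym_candidates(g, ("en", "big")))
```

Candidates found this way are noisy ("big" and "tall" are linked here only through "grand"), which is presumably where the traditional computational linguistics mentioned above would come in to filter them.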
A small percentage of the databases required for some of these later genres is posted on Webster’s Online Dictionary (http://www.websters-online-dictionary.org), which was started in 1999 as a testing ground for the general approach (i.e. the automatic authoring of original content on a web site):
Some Background & Reviews:
As only 10% of the data available are posted, future editions will be substantially larger and allow for high levels of interactivity.
In terms of R&D, substantial time and effort is currently being invested to create (1) a series of interactive web sites that can automatically author titles, (2) educational game shows and (3) language learning programs. With respect to video, instead of automating “Word” to author a book, the same process is being used to automate Maya and video editing software (software for 3D animation/video used in movies like King Kong, The Matrix, and Shrek). The goal is to create video programming that can teach any concept in any local language. It turns out that for most of the world’s languages (e.g. Estonian, Maltese, etc.), the costs of video programming using traditional methods are prohibitive, so local stations end up dubbing foreign-based programs (or programs receiving government subsidies). This project started in 2004 with 3D games and software (a bit easier to begin with than video), which has resulted in hundreds of titles distributed by Digital River, Handango and Microsoft (for Pocket PC versions) among others. The following is a YouTube link to cut scenes from a game show designed for language learning, a formulaic form of television (that is being coded for automation):
The following is a video “word of the day”, that will be used across many languages:
Here is a cut scene from the 3D game:
This FAQ covers other common questions. For each question, the generic answer is typically “it depends on the genre” or “it depends on the format (book, video, software, PC game, etc.).”
Q: Can I have a copy of the software?
A: No. The process is not a software package, but a complete system that requires that a computer or computer network be set up for this purpose – for a particular genre. Most genres are too large to be easily transferable via the Internet. The current video application is many terabytes, and others are many gigabytes.
Q: How long does it take to set up a genre?
A: This completely depends on the complexity of the genre and the quality one is willing to accept for the titles. The earliest genres took several man-years to create before they met industry standards (i.e. the quality of a human author). The later genres took a matter of months (e.g. crossword puzzle books). Sometimes the longest part is acquiring and coding domain knowledge (e.g. knowing how a Ph.D. thinks in a particular domain before they author a genre).
Q: How much does it cost to produce a book or other title?
A: It depends on how you define cost. The marginal cost of creating a title in electronic format is the price of the electricity used to create the title, and some small amount of hardware depreciation (maybe around 50 US cents). The average cost, which includes the printing of the book (in paperback), or a game in DVD or CD format (printed on demand), and the overhead to distribute the book, can range from around $10 to around $30. The total cost for an entire genre of books, videos, or software games can reach hundreds of thousands of dollars or more in programming time, database acquisition or licensing, and other overheads. Once these large sunk costs have been incurred, the marginal costs are minimal. For video or high-end gaming, the costs can be very high; with the budget to create a single traditional 3D animated movie, however, one can use this approach to create thousands of titles within a given video genre.
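To make the difference between marginal and amortized cost concrete, here is a small worked sketch in Python. The 50-cent marginal cost comes from the answer above; the 300,000-dollar genre set-up cost and the title counts are hypothetical figures chosen only for illustration, and printing and distribution costs are left out.

```python
def average_cost_per_title(fixed_cost, marginal_cost, titles):
    """Spread a one-time genre set-up cost over the titles produced.

    average cost = fixed_cost / titles + marginal_cost
    """
    if titles <= 0:
        raise ValueError("titles must be positive")
    return fixed_cost / titles + marginal_cost


if __name__ == "__main__":
    SETUP_COST = 300_000   # hypothetical sunk cost for one genre, in dollars
    MARGINAL_COST = 0.50   # electricity and depreciation, per the text above

    print(average_cost_per_title(SETUP_COST, MARGINAL_COST, 100))      # 3000.5
    print(average_cost_per_title(SETUP_COST, MARGINAL_COST, 100_000))  # 3.5
```

Under these assumed numbers, a genre that yields only 100 titles costs about 3,000 dollars per title to develop, while one that yields 100,000 titles costs a few dollars each, which is why the 50-cent figure alone understates the cost of a title.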
Q: Is this really that complicated?
A: It depends on the genre and format. While developing the early genres, it was found that rather complicated things were simple to implement (e.g. Bayesian econometrics), and logically simple things were nearly impossible to implement (e.g. getting Windows to behave well when indenting certain graphics, or rendering in DirectX). In general, Joseph Weizenbaum says it all:
'It is said that to explain is to explain away. This maxim is nowhere so well fulfilled as in the area of computer programming, especially in what is called heuristic programming and artificial intelligence. For in those realms machines are made to behave in wondrous ways, often sufficient to dazzle even the most experienced observer. But once a particular program is unmasked, once its inner workings are explained in language sufficiently plain to induce understanding, its magic crumbles away; it stands revealed as a mere collection of procedures, each quite comprehensible. The observer says to himself, "I could have written that." With that thought he moves the program in question from the shelf marked "intelligent" to that reserved for curios, fit to be discussed only with people less enlightened than he.'
Q: Professor Parker’s patent mentions the word “original”. What does that really mean?
A: From a pragmatic point of view, if one title borrows from another to a sufficiently large degree (especially without citation), it might be considered un-original, if not plagiaristic. If the two titles have so little in common that they do not seem to borrow from each other, one might say they are originals (e.g. a romance novel that has a formulaic plot, but uses different sentences and paragraphs that do not overlap to any noticeable degree with an existing romance novel with exactly the same plot). This form of originality is seen often in television game shows. Each episode is original, but each episode uses the same segment sequences. Original, but not that creative from one episode to the next. The genre in its entirety, however, can have a very creative result.
Q: Will this make human authorship obsolete?
A: Potentially yes, for at least the formulaic or mundane forms of human authorship, or human authorship for genres that are uneconomical otherwise. Which genres of authorship (in video, text, or other formats) are not formulaic enough to be automated? Time will tell.