CTO Operations

Magazine on Entrepreneurship, Project Management, Marketing and Products

Related Content for News, Video, Images


A key factor in increasing user value on a website is connecting and relating content.
The important questions are:

  • Do we have text, or just images and videos?
  • How much text do we have?
  • If we have only images and videos: can we extract text from them?
  • What is the lifecycle of an average piece of content?

The main use cases relevant to our research are:

1.) News Content: Hot, International Business News

The average news story has a lifetime of 1 hour to 7 days.
Source: Content comes mainly from international news agencies and international TV stations. In certain cases, press releases, local journalism and other sources can also be considered.
Content Type: Mainly text and stock images, sometimes recent images.
Content Length: Text length is mostly between 50 and 250 words. Local news tends to be a bit longer.
Correlation Base: Names, Brands, Locations and Events
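
To make the correlation base a bit more concrete, here is a minimal sketch of how two pieces of content could be related through the names, brands, locations and events they share. It assumes the entities have already been extracted by some tagging step, and it uses a simple Jaccard overlap as the score, which is one possible choice among many. The function names, the threshold and the sample stories are all hypothetical.

```python
def jaccard(a, b):
    """Share of entities two items have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_items(target, candidates, min_score=0.2):
    """Rank candidates by entity overlap with the target item.

    min_score is an arbitrary illustrative threshold, not a tuned value.
    """
    scored = [
        (jaccard(target["entities"], c["entities"]), c["title"])
        for c in candidates
    ]
    return sorted((s for s in scored if s[0] >= min_score), reverse=True)


# Invented sample data: entities would normally come from a tagger.
story = {
    "title": "Carmaker opens plant",
    "entities": {"Volkswagen", "Wolfsburg", "IAA"},
}
archive = [
    {"title": "Motor show preview", "entities": {"IAA", "Volkswagen", "Munich"}},
    {"title": "Local election results", "entities": {"Dornbirn"}},
]

print(related_items(story, archive))
```

With this sample data, only "Motor show preview" is returned: it shares two of four distinct entities with the story, while the election story shares none.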

Interesting WordPress Components
WP Calais Archive Tagger
veeebeditor For WordPress
WordLift allows you to add rich snippets to a given WP post
Wisetext Similarity Search technology helps to find related images, videos, and other content to your blog post as you edit

2.) Magazine Content: Motor and Car News

Magazine content is a whole different story. Here we have text and high-resolution images, in our example of cars.
Source: Magazine content is either written by the editorial staff or comes from PR agencies.
Content Type: Unique text, with images that were taken from a photo database or, more rarely, shot for this purpose. “Boulevard news” is a special case, since these stories are mainly picture-driven and image rights are hard to get.
Content Length: Text can run from 100 words to 800 words, as in the case of a fashion report. Usually many high-quality images are involved. The lifecycle of magazine content is much longer: from 1 month to 3 years, depending on the industry. Some content might exist permanently.
Correlation Base: Names, Brands, Locations and Events
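
Because news and magazine content live for such different periods, a related-content score probably should not treat their age the same way. The sketch below turns the upper lifetime bounds mentioned above (7 days for news, 3 years for magazine content) into a freshness weight. The linear decay, the dictionary keys and the function name are assumptions made for illustration; a real system might decay differently or keep evergreen content at full weight.

```python
from datetime import datetime, timedelta

# Upper lifetime bounds taken from the figures in the text;
# the content-type keys are illustrative names.
MAX_LIFETIME = {
    "news": timedelta(days=7),
    "magazine": timedelta(days=3 * 365),
}


def freshness(content_type, published, now):
    """Linear decay from 1.0 at publication to 0.0 at the end of the lifetime."""
    lifetime = MAX_LIFETIME[content_type]
    age = now - published
    remaining = 1.0 - age / lifetime  # timedelta / timedelta gives a float
    return max(0.0, min(1.0, remaining))


published = datetime(2024, 1, 1)
now = datetime(2024, 1, 4, 12)  # 3.5 days later

print(freshness("news", published, now))      # halfway through its lifetime
print(freshness("magazine", published, now))  # still almost fully fresh
```

Multiplying an entity-overlap score by such a weight would push expired news out of the related-content box while keeping older magazine pieces in play.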

Open-Source Components for Semantic Search:
Social Semantic Search and Browsing based on Java
Semantic search based on Java

Advanced Search using Solr

SaaS Services:
Nrelate: Content Marketing
Outbrain: Content Marketing

3.) Multimedia Content, especially Video

Video and image content is currently not treated as a media type of its own, since its usage is much less intuitive. Mostly, videos and images are part of a news or magazine story; sometimes they are stand-alone content. Truly stand-alone video or image content is found only in photo or video communities such as YouTube, Vimeo or Flickr.
Source: News stories are produced by TV stations or news agencies; sometimes footage is shot by agencies or individuals (cat videos).
Content Length: Flipping through images is an interactive process and therefore produces lower bounce rates. Videos tend to show higher bounce rates. Typical points at which viewers bounce when watching a video are: up to 15 sec, 25 sec, 30 sec, 1:10 min, 3:20 min.

Issues: As with automatic voice recognition, the quality of text recognition is limited, so results are not perfectly precise.
Correlation Base: Names, brands, locations and events, extracted from images and videos

Video to Text
What is bounce rate? Watch the video:

http://www.youtube.com/watch?v=lGPnzSBVBOk
Good tips on video SEO:

4.) Print Content Brought to Digital

Like video and image content, this kind of content is not really made for the web. Therefore we need technical workarounds. A typical workaround is OCR software, which aims to identify text inside an image.
Source: Marketing or corporate communication
Content Length: Varies widely depending on the source: from 1 word to 800+ words
Issues: As with automatic voice recognition, the quality of text recognition is limited, so results are not perfectly precise.
Correlation Base: Extracted Text and image recognition
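
Since recognized text is rarely perfectly precise, an exact string match against known names and brands will miss many hits. One possible workaround, sketched below, is to match noisy tokens against a list of known entities by string similarity, here with `difflib.get_close_matches` from the Python standard library. The function name, the cutoff of 0.8 and the sample text are assumptions for illustration, and the sketch only handles single-word entities.

```python
import difflib


def match_entities(ocr_text, known_entities, cutoff=0.8):
    """Map noisy OCR tokens to known entity names by string similarity.

    cutoff is an illustrative value; lower it to tolerate more errors
    at the cost of more false matches.
    """
    lookup = {name.lower(): name for name in known_entities}
    found = []
    for token in ocr_text.split():
        cleaned = token.strip(".,;:!?").lower()
        hits = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
        if hits and lookup[hits[0]] not in found:
            found.append(lookup[hits[0]])
    return found


# Invented sample: "Volkswagcn" and "Dornbirm" mimic typical OCR misreads.
text = "New Volkswagcn model shown in Dornbirm."
print(match_entities(text, ["Volkswagen", "Dornbirn", "Toyota"]))
```

For the sample sentence this recovers Volkswagen and Dornbirn despite the misread characters, which can then feed the same entity-based correlation used for regular text content.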

OCR Modules
5 Free OCR Programs
Free OCR Software
German OCR Software

Interesting Links on Semantic Search

Infrastructure for parser based text analysis in Emacs
Smart Linking of Data-Sets using Media Wiki
Semantic Link Directory and Social Bookmarking Tool

Author: Guntram Bechtold

Guntram Bechtold lives with his family in Dornbirn. Together with friends and business partners, he creates offerings such as the BusinessLabs, the AgentConf, the WorkerConf and the Plattform für Digitale Initiativen. The passionate runner is known and feared for his livestreams from the bicycle. Guntram works at the Vorarlberg digital agency StarsMedia and helps organize the Smart City Dornbirn competition.
