Amazon.com has added XML search and data handling technology to its Amazon Web Services (AWS) vending machine of cloud-friendly services, via partner Mark Logic Inc.
Mark Logic pitches its technology at organizations with very large collections of data and lets them build search and query applications very quickly, without the use of traditional relational databases to organize and index the data.
Mark Logic usually sells its software as a data center appliance, with an onsite installation that puts it close to the masses of data it uses. Virtualizing its flagship XML Server for AWS and letting users deploy it themselves for a tiny fraction of standard license fees might not seem like a smart move, but analysts are not fazed.
"It's actually a really good move," said Melissa Baxter, vice president of content and digital media technologies for IDC.
She said that Mark Logic has proven its technology over the last several years and hit a critical mass of large, paying customers -- Edmunds.com and the U.S. Patent Office, among others -- and that interest in "unstructured data" technologies and the cloud is running very high.
Mark Logic is offering its XML Server running on an Amazon Machine Image (AMI) and also making it available as a VMware machine image for virtualized users who prefer to stay in-house.
Storage and servers have gotten cheap and, according to Baxter, enterprises are sitting on ever-increasing mountains of information that they want to put to use but that cannot fit into a traditional database application. Couple that with cloud computing, which means firms don't have to invest capital in IT projects, and the opportunity looks good.
"People's attitudes about crunching huge data sets have really changed – that's partly [due to] cloud," she said.
What Mark Logic brings to the table
Mark Logic's software takes advantage of XML, nearly ubiquitous these days in media files, documents, Web pages and information sets, to sort and return results on masses of information without having to load the data into a relational database first. For the appropriate information sets, that removes an enormous chunk of the operational overhead of managing and using data.
A batch of Microsoft Office files, for instance, can be sorted and queried to return results as if all the content had been indexed into a database, without having to change the files in any way.
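As a rough illustration of querying XML content in place, the sketch below filters a handful of XML documents by field and keyword without loading them into a relational table. It is not Mark Logic's API: it uses only Python's standard `xml.etree.ElementTree` module, the document names, element names and `query` function are hypothetical, and it scans every document on each query, whereas a real engine would rely on indexes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; in practice these would be files left
# unchanged wherever they already live.
DOCS = {
    "memo1.xml": "<doc><title>Q3 budget</title><author>Lee</author>"
                 "<body>Cloud spending rose.</body></doc>",
    "memo2.xml": "<doc><title>Hiring plan</title><author>Kim</author>"
                 "<body>Two new engineers.</body></doc>",
    "memo3.xml": "<doc><title>Cloud migration</title><author>Lee</author>"
                 "<body>Move archives to S3.</body></doc>",
}


def query(docs, author=None, keyword=None):
    """Return (name, title) pairs for documents matching the given filters."""
    results = []
    for name, text in sorted(docs.items()):
        root = ET.fromstring(text)
        # Field filter: compare the text of the <author> element.
        if author is not None and root.findtext("author") != author:
            continue
        # Keyword filter: case-insensitive search over all text in the document.
        full_text = " ".join(root.itertext()).lower()
        if keyword is not None and keyword.lower() not in full_text:
            continue
        results.append((name, root.findtext("title")))
    return results


if __name__ == "__main__":
    for name, title in query(DOCS, author="Lee", keyword="cloud"):
        print(name, title)
```

The source documents are never modified or copied into tables; the structure already present in the XML is what makes the field-level filter possible.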
Baxter said that Mark Logic's partnership with Amazon will let users experiment with the technology and that the move is low-risk for Mark Logic. The company didn't have to invest heavily to use Amazon, and it charges on top of Amazon's regular fees. On Amazon's side, users will put data in S3 and probably stick around, even if they use Mark Logic only a little. They don't stand to lose even if usage is very low, and developers can experiment without making a massive investment in their data centers.
Mark Logic's John Kreisa, the director of industry solutions, said the startup, which is still venture funded, has around 200 customers. He added that the largest data set using Mark Logic clocks in at around 150 terabytes, and that the nature of the company's technology is in sync with virtualization and cloud. XML is a key standard for most web-based applications at this point, and Mark Logic's querying engine is a distributed, "shared nothing" application that is designed to let many servers work in concert.
"Shared nothing" DBMSes, also called massively parallel processing (MPP) engines, are coming increasingly into vogue, with the likes of Netezza, Vertica, Teradata and IBM's DB2 leading the charge to take advantage of cheap servers and mass amounts of loose data. While DB2, which dates back to the first implementations of SQL and also includes native XML support, is a primary competitor to Mark Logic, the firm prefers to style itself as a replacement for Oracle, which often comes off badly in performance comparisons to next-generation DBMSes.
Kreisa said that Mark Logic Cloud Services, which now consist of the Mark Logic server bundled into an AMI, will run from $2 to $14 per hour, depending on the type of EC2 instance used. The VMware image will be sold under the existing software license model, which Kreisa has said is comparable in price to Oracle.
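To put the hourly range in perspective, the arithmetic below estimates what one instance would cost if left running around the clock. The 30-day month and the `monthly_cost` helper are assumptions made for illustration, and the article does not say whether the quoted rates include Amazon's own EC2 charges, so the sketch uses the quoted figures alone.

```python
# Assumption: a 30-day month with the instance running 24 hours a day.
HOURS_PER_MONTH = 24 * 30


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost in dollars of running one instance for the given number of hours."""
    return hourly_rate * hours


if __name__ == "__main__":
    low = monthly_cost(2.00)    # smallest quoted rate
    high = monthly_cost(14.00)  # largest quoted rate
    print(f"Always-on for a month: ${low:,.2f} to ${high:,.2f}")
    # A short experiment, e.g. one 8-hour working day at the lowest rate:
    print(f"One 8-hour trial: ${monthly_cost(2.00, hours=8):,.2f}")
```

Under those assumptions an always-on instance works out to roughly $1,440 to $10,080 a month, while a one-day trial at the lowest rate costs $16, which is the kind of low-commitment experimentation Baxter describes.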
Carl Brooks is the Technology Writer at SearchCloudComputing.com. Contact him at firstname.lastname@example.org.