Posts tagged "apache"

Apache Software Foundation Divorces JCP over Irreconcilable JSR Differences

The Apache Software Foundation (ASF) today announced its resignation from the JCP (Java Community Process) Executive Committee (EC). This comes only a day after the Java SE 7 and SE 8 specs (JSR-336 and JSR-337 respectively) were officially approved by the JCP despite the ASF, Eclipse Foundation and Google voting against.

Evil Plans and Stirring the Pot

The conflict goes back to 2006 and, most recently, has centered on the refusal by Oracle (which acquired Sun Microsystems, the inventor of Java) to grant a Java TCK (technology compatibility kit) license to Apache Harmony, the ASF's open source implementation of Java. Apache has objected to the restrictions placed on Harmony and threatened to leave the JCP. Since then, the differences have not been ironed out.

The JCP is responsible for selecting which technologies to approve as official Java specifications, as it did in the past with JSR-170 and JSR-283, which are near and dear to our CMS hearts. The JCP is supposed to foster an open specification process and protect the open licensing structure.

However, there are fears that Oracle is taking control of the JCP, which is supposed to be an unbiased and independent body. Oracle wouldn't agree to grant a Java compatibility license for the ASF's Harmony project.

This may indicate that Oracle is trying to keep a tight rein on any alternative implementations of Java other than its own version, while backing the OpenJDK open source version of Java.

No Harmony in the Java World

With these restrictions on distribution, the Apache Software Foundation decided to leave the JCP in a post published today, saying:

By approving Java SE 7, the EC has failed on both counts: the members of the EC refused to stand up for the rights of implementers, and by accepting Oracle's TCK license terms for Java SE 7, they let the integrity of the JCP's licensing structure be broken.

The Apache Software Foundation concluded that JCP is not an open specification process and that "the commercial concerns of a single entity, Oracle, will continue to seriously interfere with and bias the transparent governance of the ecosystem."

Since it is not possible to protect the rights of implementers and to distribute independent implementations of JSRs under open source licenses without the fear of litigation from Oracle, Apache decided to express its disdain for JCP with an immediate resignation and removal of all official ASF representatives from "any and all JSRs."

One can only wonder (or be slightly depressed?) about what kind of implications this development may have for the content management industry. Many Web CMS and Enterprise CMS products are Java-based. Many of them are open source.

While many large enterprises are comfortable with Oracle and Java as their language of choice, many of them also use open source technologies like Apache Tomcat and the like. Above all, Apache has a reputation for bringing innovation to the table with its projects. Innovation is not the prime factor that drives the money-making machine that is Oracle.

As we discussed before, Oracle may have a considerable impact on the industry, from many different angles. Not many of them were without controversy. Care to share your thoughts?

Denial of Service on an Apache server

Last week was a very frustrating time for me. For whatever reason, an unusually large number of botnets decided to zero in on my Drupal site and created what I call an unintentional Denial of Service (DoS) attack. The attack was actually from spambots looking for script vulnerabilities found mainly in older versions of e107 and WordPress. Since the targets of these spambots were non-Drupal pages, my Drupal site responded by delivering an unusually large number of "page not found" and "access denied" error pages. Eventually, these requests from a multitude of IPs were too many for my server to handle, and for all intents and purposes the botnet attack caused a distributed denial of service that prevented me and my users from accessing the site.
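
To see whether a site is caught in this pattern, it can help to tally the "page not found" (404) and "access denied" (403) responses per client IP in the Apache access log. What follows is only a minimal sketch in Python, assuming the log uses Apache's common or combined format; the function name and the sample lines are made up for illustration and are not taken from my server.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log line:
# client IP, identity, user, [timestamp], "request line", status code.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3})')

def count_error_hits(lines, statuses=("403", "404")):
    """Count responses with the given status codes per client IP."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) in statuses:
            hits[match.group(1)] += 1
    return hits

# Hypothetical sample lines, invented for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2010:13:55:36 -0700] "GET /e107_files/x.php HTTP/1.1" 404 512',
    '203.0.113.5 - - [10/Oct/2010:13:55:37 -0700] "GET /wp-login.php HTTP/1.1" 404 512',
    '198.51.100.7 - - [10/Oct/2010:13:55:38 -0700] "GET /node/1 HTTP/1.1" 200 2326',
    '198.51.100.9 - - [10/Oct/2010:13:55:39 -0700] "GET /admin HTTP/1.1" 403 298',
]

error_hits = count_error_hits(sample)
for ip, count in error_hits.most_common():
    print(ip, count)
```

In a botnet situation the list tends to be long and flat (many IPs, a few hits each), which is part of why blocking individual addresses does not get you very far.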

These types of attacks on Drupal sites are nothing new and have been observed and discussed at great length. However, my searches, including on Google, didn't really turn up a solution that completely addressed my problem. Trying to prevent a DDoS attack isn't easy to begin with, and at first the answers eluded me.

I originally looked at Drupal for the solution to my problems. While I've used Mollom for months, Mollom is designed to fight off comment spam, while the bots attacking my site were looking for script vulnerabilities that didn't exist. So with Mollom being the wrong tool to fight off this kind of attack, I decided to take a look at the Drupal contributed module Bad Behavior. Bad Behavior is a set of PHP scripts that prevents spambots from accessing your site by analyzing their actual HTTP requests and comparing them to profiles of known spambots; it then blocks such access and logs the attempts. I actually installed an "unofficial" version of the Bad Behavior module which packages the Bad Behavior 2.1 scripts and utilizes services from Project Honey Pot.

As I had already suspected, looking for Drupal to solve this botnet attack wasn't the answer. Pretty much all Bad Behavior did for me was to take the time Drupal was spending delivering "page not found" error pages and use it to deliver "access denied" error pages. My Drupal site is likely safer with the Bad Behavior module installed, but it was the wrong tool to help me keep the botnets from overtaxing Drupal running on my server. Ideally, you would like to prevent the attacks from ever reaching your server by taking a look at such things as the firewall, router, and switches. However, since I didn't have access to the hardware, I decided it was time to look at my Apache configuration.

I host my sites on a VPS and use cPanel to help manage the site. While cPanel's defaults will give you a stable server, there is definitely room to improve the default configuration. Despite all the places I searched for answers, the Apache documentation itself was the most helpful in showing me which Apache HTTP Server configuration settings I should look at when addressing DoS attacks.

I eventually looked at two directives to help resolve my DoS attacks, MaxClients and TimeOut. For whatever reason, cPanel chooses a default value of 150 for MaxClients even though Apache's default is actually 256. Knowing that my server wasn't accessible to new clients whenever the MaxClients limit was reached, this was the first httpd directive I wanted to change. Raising this number seemed to delay the effects of the botnet overwhelming my server, but it didn't quite solve the problem. Now instead of 150 bot requests being capable of stalling out my server, I could process 256 bot requests. All raising MaxClients did was invite more disrespectful people to a party that was already getting out of hand.

So I moved on to what was ultimately the solution in my case: I lowered the value given in the TimeOut directive. The value configured for the TimeOut directive is the amount of time the server will wait for certain events before failing a request. Apache gives the following security tips for how this can be configured to help prevent DoS attacks:

The TimeOut directive should be lowered on sites that are subject to DoS attacks. Setting this to as low as a few seconds may be appropriate. As TimeOut is currently used for several different operations, setting it to a low value introduces problems with long running CGI scripts.

For whatever reason, this directive by default is set to 300 seconds. While I can see a number of reasons why you might need five minutes to run a process before failing the request, that's a value I would be more comfortable having on an intranet server (fully protected by a firewall from the wild wild Web) than on an Internet server. So I lowered the TimeOut directive from 300 seconds to 10 seconds. After the value change, the average number of requests being processed at any given time dropped from 256 down to around 40. Most Drupal sites are going to need more processing time than 10 seconds, so you'll find out as I did that this number needs to be higher than 10. So far, I have found that a value of 45 for the TimeOut directive allows my site to keep server performance high while handling all those requests from the bots without killing legitimate Drupal-related processes.
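
A rough back-of-the-envelope model shows why the timeout matters more than the number of workers. The Python sketch below assumes, as a simplification, that every stalled bot connection ties up one worker for the full TimeOut; real traffic is messier (legitimate requests, KeepAlive, and so on are ignored), and the rate of one stalled connection per second is a made-up figure, not something I measured.

```python
def saturating_rate(max_clients, timeout_seconds):
    """Stalled connections per second needed to keep every worker busy."""
    return max_clients / timeout_seconds

def busy_workers(stalled_per_second, timeout_seconds, max_clients):
    """Estimated workers tied up by stalled connections, capped at max_clients."""
    return min(max_clients, stalled_per_second * timeout_seconds)

MAX_CLIENTS = 256          # MaxClients value from the post
STALLED_PER_SECOND = 1.0   # hypothetical bot rate, for illustration only

for timeout in (300, 45, 10):
    rate = saturating_rate(MAX_CLIENTS, timeout)
    busy = busy_workers(STALLED_PER_SECOND, timeout, MAX_CLIENTS)
    print(f"TimeOut {timeout:>3}s: saturates at {rate:.2f} stalled conn/s, "
          f"about {busy:.0f} workers busy at {STALLED_PER_SECOND} conn/s")
```

Under these assumptions, one stalled connection per second is enough to occupy all 256 workers with a 300 second timeout, but only about 45 of them with a 45 second timeout.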

So in the end, if you find that the spambots are overwhelming your Drupal site and you have the ability to override the httpd configuration file, try lowering the value of your TimeOut directive to 45 or some other low number. Doing this first might just solve your problem and save you from having to write a long-winded blog post about your experience.
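
If you would rather script the change than edit the file by hand, something along these lines could work. It is only a sketch in Python: the `set_directive` helper and the sample configuration text are hypothetical, and it works on a string rather than touching a real httpd.conf, so back up and test before trying anything similar on a live server. Apache also has to be restarted or reloaded before a new value takes effect.

```python
import re

def set_directive(conf_text, name, value):
    """Return conf_text with the first uncommented `name` directive set to value.

    If the directive is not present, it is appended at the end.
    """
    # Directive names are case-insensitive in httpd configuration files.
    pattern = re.compile(rf"^([ \t]*){re.escape(name)}[ \t]+.*$",
                         re.IGNORECASE | re.MULTILINE)
    replacement = rf"\g<1>{name} {value}"
    new_text, count = pattern.subn(replacement, conf_text, count=1)
    if count == 0:
        new_text = conf_text.rstrip("\n") + f"\n{name} {value}\n"
    return new_text

# Hypothetical excerpt, not a real cPanel-generated file.
sample_conf = "# global settings\nTimeout 300\nMaxClients 256\n"
print(set_directive(sample_conf, "Timeout", 45))
```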

Open Source Search Developers and Industry Experts to gather in Prague for Inaugural Apache Lucene EMEA Conference

European Conference Announced For 18th-21st May; Showcasing Customers, Developers, Technology and Search Innovation

San Mateo, CA (April 22nd 2010): Lucid Imagination, the commercial company for Apache Lucene and Solr open source search technologies, today announced details of the first conference in Europe dedicated to Lucene and Solr. The event will be held in Prague, Czech Republic, May 18th-21st. Full details can be found on the conference website. The conference, presented by Lucid Imagination, is running as a not-for-profit venture, with net proceeds donated to The Apache Software Foundation.

Apache Lucene EuroCon builds on the accelerating marketplace adoption of Lucene/Solr open source search technology, challenging legacy technologies from costly, proprietary platforms. Technologists and executives alike recognize that they need search applications tailored to fit their business needs and their marketplace. Attendees will learn how Apache Lucene/Solr open source community innovation is overtaking conventional technologies, unleashing new and innovative search-enabled applications that deliver competitive advantage, scalability, and cost economies.

“Europe has emerged as a key hub of activity within the Lucene/ Solr developer community,” says Eric Gries, CEO, Lucid Imagination. “Companies are facing explosive growth in the volume and diversity of data, fueling demand for enterprise-ready search capabilities without the costly lock-in of proprietary commercial search. This conference will be essential for search application developers who want to build on their expertise for Lucene/Solr search-enabled applications.”

Search technology experts from across Europe and the world will be demonstrating how Lucene/Solr delivers more powerful, flexible, scalable and innovative search applications. Companies as diverse as AT&T, Zappos, LinkedIn, Nordjyske Media and The Guardian, along with nearly 4000 other organizations worldwide, all use Lucene/Solr. Notable speakers at the conference include Matt McAlister, Head of the Developer Network at The Guardian, who will speak about the use of Lucene/Solr-driven content provisioning to enable new business models and accelerate revenue. Zack Urlocker, former EVP of Products at MySQL, will provide insight as to how combined innovations from open source, search, big data and cloud technology are fomenting creative disruption in IT.

The conference has been designed with the developer community in mind. The agenda of speakers, workshops and training sessions provides a great opportunity for the Apache Lucene/Solr and Open Source developer community to come together and learn what’s new, get practical advice in intensive sessions on Lucene/Solr search application development from experienced real-world developers, and network with other community members. Co-sponsors include Findwise Technologies, SourceSense and KippData. Additional featured speakers from Lucid Imagination include Grant Ingersoll, Chair of the Lucene Project Management Committee; Yonik Seeley, creator of Solr; and Erik Hatcher, author of Lucene in Action; along with Andrzej Bialecki, Apache Nutch project lead. Experts from Day Software, VRT-medialab, Nokia, IBM and over a dozen other companies will also deliver sessions.

For more information on Lucid Imagination or the conference, please visit the conference website. The fee for a General 2 Day Conference Pass is €545, with an early bird discount to €395 if registered by 27 April. Two training courses are offered at two days each, Solr Application Development Workshop and Lucene Bootcamp, for €995. A Conference Pass & Training Bundle is available for €1295, a savings of €245.

Apache Chemistry Gains New Contributors via OpenCMIS

Just recently, we reported on Nuxeo’s steady progress with Apache Chemistry, a Java implementation of the CMIS spec.

The newest development on this front is OpenCMIS, a project led by Alfresco, SAP and Open Text, which is contributing its collection of libraries, frameworks and tools around CMIS to Apache Chemistry.

No, it is *not* an attack against Chemistry, but more of a friendly merger.

Recap on Apache Chemistry

Initiated by Day Software (see our interview with CTO David Nuescheler), Sourcesense and Nuxeo, Apache Chemistry started as a proposal for a new sandbox called ‘jcr-cmis’ in Apache Jackrabbit. Chemistry entered Apache incubation in April 2009.

Java-centric Apache Chemistry includes:

  • a high level API
  • a low level SPI
  • generic implementations of clients and servers for AtomPub and SOAP bindings
  • sample backends to serve data from repositories

Chemistry now targets CMIS 1.0 CD 05 draft, soon to be 06.

What is OpenCMIS and How the Two Come Together

OpenCMIS (dating back to summer 2009) is a community of folks employed by Alfresco, SAP and Open Text with the usual suspects as initial committers:

  • Florian Mueller (Open Text)
  • Jens Huebel (Open Text)
  • David Caruana (Alfresco)
  • David Ward (Alfresco)
  • Martin Hermes (SAP)
  • Stephan Klevenz (SAP)
  • Paul Goetz (SAP)

The goal of this CMIS implementation is to provide an enterprise-ready client library for Java that was missing in the existing CMIS prototypes, according to Open Text’s Florian Mueller.

Mueller describes the OpenCMIS architecture as follows, pointing out some differentiators between OpenCMIS and Chemistry:

  • There are two layers in OpenCMIS: the provider layer and the client layer.
  • The provider layer implements CMIS bindings. The opencmis-provider-api maps the CMIS domain model and handles immutable data objects (whereas chemistry-api follows an object-oriented approach)
  • The client layer, being on top of the provider layer, is a Java-like interface with all the classes and methods expected in an object-oriented interface
  • Chemistry uses Abdera to communicate with the server, whereas OpenCMIS is based on JAXB and some CMIS-specific XML coding
  • OpenCMIS has a caching infrastructure that is specific to CMIS and OpenCMIS
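
The split between the two layers described above can be pictured with a small sketch. It is written in Python only to keep it short; OpenCMIS itself is a Java library, and every class and method name below is hypothetical rather than taken from the OpenCMIS or Chemistry APIs. The provider layer hands out immutable data records, and the client layer wraps them in an object-oriented interface.

```python
from dataclasses import dataclass

# Provider layer: plain, immutable data objects plus a thin binding interface.
@dataclass(frozen=True)
class ObjectData:
    object_id: str
    name: str
    type_id: str

class InMemoryProvider:
    """Hypothetical stand-in for a binding; a real one would speak AtomPub or SOAP."""
    def __init__(self, objects):
        self._objects = {o.object_id: o for o in objects}

    def get_object(self, object_id):
        return self._objects[object_id]

    def rename(self, object_id, new_name):
        old = self._objects[object_id]
        # Immutable data: build a new record instead of mutating the old one.
        updated = ObjectData(old.object_id, new_name, old.type_id)
        self._objects[object_id] = updated
        return updated

# Client layer: an object-oriented view built on top of the provider layer.
class Document:
    def __init__(self, provider, data):
        self._provider = provider
        self._data = data

    @property
    def name(self):
        return self._data.name

    def rename(self, new_name):
        self._data = self._provider.rename(self._data.object_id, new_name)
        return self

class Session:
    def __init__(self, provider):
        self._provider = provider

    def get_document(self, object_id):
        return Document(self._provider, self._provider.get_object(object_id))

provider = InMemoryProvider([ObjectData("doc-1", "draft.txt", "cmis:document")])
session = Session(provider)
doc = session.get_document("doc-1").rename("final.txt")
print(doc.name)
```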

As Mueller notes, “The overall architecture and principles below the API are very, very different. Bringing both together would require philosophy changes on both sides. I'm not saying that this isn't possible, but it's a lengthy process.”

Later on and more optimistically, Day’s Paolo Mottadelli describes OpenCMIS in his blog as “the last blast for Chemistry; the other big thing of the beginning of 2010,” as OpenCMIS joins Apache Chemistry with a request to merge the two codebases on the Apache Maven infrastructure. OpenCMIS, by the way, also uses other Apache products, such as Commons Codec and Commons Logging.

This merge is definitely an advancement in open source CMIS efforts on server and client sides, and covers different areas of the project, including:

  • Low level CMIS client library with support for AtomPub and Web Services bindings
  • High level CMIS client library sitting on top of the low level client with Java API (still in development)
  • CMIS server handling CMIS bindings on the server side and mapping them to a common set of Java interfaces
  • InMemory test repository for the CMIS server. A file system based test repository is under development and should be available soon
  • CMIS browser (currently, AtomPub only) for access to CMIS-enabled repositories

Nuescheler referred to OpenCMIS as “well architected and already very mature in its code base.” Even though CMIS is not even an official standard yet (the second round of public review ends today), and these two projects come from different backgrounds, this joint venture looks like a good approach to collaboration, improving the code and helping bring CMIS adoption to the masses.