According to the Apache XML project website, the project's goals are:
The project homepage is located at http://xml.apache.org. It is an umbrella for a variety of subprojects.
This is a quick introduction to XML. To learn more about XML, a good starting point is http://www.xml.com. XML is a markup language (think HTML) for describing structured content using tags and attributes. Once content is separated from presentation, you can choose how to display it (cell phone, HTML, text) or exchange it. The XML standard only describes how the tags and attributes can be arranged, not their names or what they mean. Apache provides the tools described in the following sections.
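To make the idea of tags and attributes concrete, here is a minimal sketch that reads a small XML document with the JAXP API that ships with the JDK. It is not Xerces-specific code. The sample document, its tag names (library, book, title), the isbn attribute and the class name XmlTagsDemo are all invented for illustration; nothing in the XML standard gives them meaning, which is exactly the point made above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XmlTagsDemo {
    // Hypothetical sample content: tag and attribute names are made up for this sketch.
    public static final String SAMPLE =
        "<library>"
      + "<book isbn=\"0-000-00000-0\"><title>First Title</title></book>"
      + "<book isbn=\"1-111-11111-1\"><title>Second Title</title></book>"
      + "</library>";

    // Parses the XML text into a DOM tree and lists each book as "isbn -> title".
    public static String listBooks(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            NodeList books = doc.getElementsByTagName("book");
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                // The attribute is read by name; the nested tag by walking the tree.
                String isbn = book.getAttribute("isbn");
                String title =
                    book.getElementsByTagName("title").item(0).getTextContent();
                out.append(isbn).append(" -> ").append(title).append('\n');
            }
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(listBooks(SAMPLE));
    }
}
```

The same content could just as well be rendered as HTML or plain text, or handed to another program, because the document carries structure only and no presentation.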
The Xerces project provides XML parsers for a variety of languages, including Java, C++ and Perl. The Perl bindings are based on the C++ sources. There are Tcl bindings for Xerces in the 2.0 version of TclXML, by Steve Ball. This 2.0 version is available through the SourceForge project page. An XML parser is a tool used for programmatic access to XML documents. This is a description of the standards supported by Xerces:
Xalan is an XSLT processor available for Java and C++. XSL is a style sheet language for XML; the T stands for Transformation. XML is good at storing structured data (information), but we sometimes need to display this data to the user or apply some other transformation to it. Xalan takes the original XML document, reads a stylesheet that describes the transformation, and outputs HTML, plain text or another XML document. You can learn more about Xalan at the Xalan Java and Xalan C++ project homepages.
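The document-plus-stylesheet flow described above can be sketched with the standard javax.xml.transform (JAXP) interface. This is an illustration under stated assumptions, not Xalan's own sample code: it runs on whatever XSLT processor the JDK provides, and Xalan Java is normally driven through the same interface when it is on the classpath. The sample document, the stylesheet and the class name XsltDemo are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltDemo {
    // Hypothetical input document (names are made up for this sketch).
    public static final String XML =
        "<library>"
      + "<book><title>First Title</title></book>"
      + "<book><title>Second Title</title></book>"
      + "</library>";

    // Hypothetical stylesheet: emits one book title per line as plain text.
    public static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:for-each select=\"library/book\">"
      + "<xsl:value-of select=\"title\"/><xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and returns the transformed output.
    public static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(transform(XML, STYLESHEET));
    }
}
```

Changing only the stylesheet (for example, setting the output method to html and emitting list markup) would produce a different presentation of the same data without touching the XML source.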
From the website: "FOP is a Java application that reads a formatting object tree and then turns it into a PDF document." FOP thus takes an XML document and outputs PDF, much as Xalan outputs HTML or text. You can learn more about FOP on the FOP project page.
Cocoon builds on other Apache XML technologies such as Xerces, Xalan and FOP to provide a comprehensive publishing framework. Cocoon is based on XML and XSL and is targeted at sites of medium to high complexity. It separates content, logic and presentation, as described on the website:
Apache SOAP ("Simple Object Access Protocol") is an implementation of the SOAP submission to W3C. It is based on, and supersedes, the IBM SOAP4J implementation.
From the draft W3C specification: SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML based protocol that consists of three parts:
Batik is a Java-based toolkit for applications that want to use images in the Scalable Vector Graphics (SVG) format for various purposes, such as viewing, generation or manipulation.
It is XML-centric and compliant with the W3C specification. It differs from most other Apache projects in that it provides a graphical component. Batik provides hooks to extend the framework through custom tags, and it allows conversion from SVG to other formats such as JPEG or PNG.
Crimson is an alternative, Java-based XML parser with support for XML 1.0 through a variety of interfaces. It is the parser currently shipping in Sun products, and an intermediate step until version 2 of Xerces is released.
There are other projects based on Apache and XML that do not live under the Apache XML umbrella