Steve: Developing on the Edge
Thoughts on development, Web-services, technology and mountains.
Wed, 4 Jan 2006
Sometimes Maven repositories work.

Sometimes Maven repositories and pom files work, and you can add a new complex library just by declaring a dependency.

Those days are profound, and give a taste of the possible, a taste to be treasured. For example, hibernate-3.1 is up there, with all its dependencies. OK, maybe too many: it pulls in commons-logging and not commons-logging-api, so I get logkit and bits of avalon. But it does, impressively, pull in everything needed to build (I will check running shortly).

The trouble is, those times when it works are not frequent enough. More often than not I end up rummaging in the local repository to delete .pom files to trigger a reload, adding more servers to my fetch list, or trying to make do with outdated versions of JARs because current releases aren't there. Then there is creating entries in a private repository of the javax.* packages, or editing the exclude options to exclude things I don't want.

This is a shame as it makes it that much harder to work with a library.

How should repositories work?

  1. All OSS projects shipping versions should be there, with source and javadoc artifacts too. That way IDEs can fetch the JARs on demand.
  2. All JARs should be signed, although there are consequences here.
  3. All Sun javax.* JARs should be available too, with source if they are APIs. These should also be signed. Indeed, why don't Sun sign javax packages?
  4. Repository searching should be more integrated. I have that maven repo search tool on my bookmark tab, but it doesn't do enough. It cannot find any jtidy artifacts today, even though ibiblio hosts one version.
  5. POM consistency should be enforced. When you view an artifact you should be able to see what it depends on (like rpm tools do). Any unobtainable artifact should be immediately visible. Indeed, there should be a bias against allowing anybody to publish anything to the repository that contains an unsatisfiable dependency, as it is then inherently unusable.
  6. Some of the common poms should be cleaned up. For example, commons-logging should not include any implementation stuff.
  7. Indexing artifacts by hash code should be allowable. That is, instead of asking for a version, I ought to be able to declare a SHA1 checksum and have that used for retrieval. When POMs are brought in to the repository, anything without a checksum in a dependency should get it added to the POM. Why so? It enforces security. Oh, and these checksums should exclude JAR signatures: signing a JAR shouldn't change its checksum.
  8. Have an announcement RSS feed of JAR updates. This lets you watch what has changed.
  9. Require all web sites that mirror the content to return a 404 on a missing artifact, not, like planet-mirror, 200 and a search page.
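The checksum idea in point 7 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not anything Maven or a repository actually implements: the function names (`content_sha1`, `is_ignored`, `make_jar`) are hypothetical, and the sketch hashes entry names and contents rather than the raw archive bytes. It skips the signature files under META-INF/ and, as a simplification, the manifest too, because signing a JAR also adds per-entry digests to MANIFEST.MF.

```python
import hashlib
import io
import zipfile

# File suffixes used for signature entries directly under META-INF/.
SIGNATURE_SUFFIXES = (".SF", ".RSA", ".DSA", ".EC")


def is_ignored(name: str) -> bool:
    """True for entries that signing adds or rewrites (assumption: manifest included)."""
    upper = name.upper()
    if not upper.startswith("META-INF/"):
        return False
    leaf = upper[len("META-INF/"):]
    if "/" in leaf:
        return False  # only files directly inside META-INF/ are signature-related
    return (
        leaf == "MANIFEST.MF"
        or leaf.endswith(SIGNATURE_SUFFIXES)
        or leaf.startswith("SIG-")
    )


def content_sha1(jar_bytes: bytes) -> str:
    """SHA-1 over entry names and contents, in sorted order, skipping ignored entries."""
    digest = hashlib.sha1()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in sorted(jar.namelist()):
            if name.endswith("/") or is_ignored(name):
                continue  # skip directory entries and signature material
            data = jar.read(name)
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


def make_jar(entries: dict) -> bytes:
    """Hypothetical helper: build an in-memory JAR from {entry name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for name, data in entries.items():
            jar.writestr(name, data)
    return buffer.getvalue()
```

With this scheme, two archives that differ only in their signature files and manifest yield the same hex digest, while any change to a class file yields a different one. Ignoring the whole manifest is a blunt choice; a fuller version would probably keep the main manifest attributes and drop only the per-entry digest sections.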

Now, what would I do with the app?

Overall, I would treat library management as more than just a build time option. It is runtime too. I'd also make creating dependency metadata easier. It's more than just a bit of a single build tool, be it Ant, Maven or an IDE.
  • Have an easy way to view/audit the entire dependency graph of an app
  • Make it trivial to extract manifest/war data from the dependency graph
  • Have a "no-snapshots" build option that will halt the build if there is a dependency on a snapshot release. This is to stop you shipping code based on other people's unreleased stuff.
  • You should have a validate/force option on the client code that verifies that all the JAR files are still there and consistent. Otherwise a snapshot may have gone away, but you don't notice as "clean" builds don't trigger downloads again.
  • Integrate Java Web Start with it. If you depend on JARs, they get pulled in from the big repository set, validated and shared. Your app still only gets local access. Oh, and I'd allow for JAR files to have a marker to say "No JWS", a kind of revoke-option, so that you can serve up old (insecure) versions but not leave open back doors. Adding a no-jws file would be one trick.
  • Make it easy to install and use out the box. The installer should prompt for proxy configuration, provide a browser for the files and status.
  • I'd like to be able to walk backwards from a JAR and see who uses it, and whether it is safe to delete.
  • Have much better ant integration than the current set of M2 tasks. You can see that Ant support is an afterthought. Logging is different, pom files can't be created inline easily, diagnosis is a pain. Also, I don't think they get used much. I seem to be the only person in Gump using them, and the person who finds problems. They don't use ant-unit or ant-testutils for testing, either.
  • It would be nice to extend ant so that inline paths (like in a java task) could declare a repository list inline too. Essentially we'd need a new datatype inside a path, or somehow fit a list of libraries into a resource set.
  • The other big thing to do with client code is make it robust against network problems. Proxy servers returning custom error pages, transient network failures, patchy DNS. These are all things we encounter.
  • Parallel download. If the server is the bottleneck, do some gets in parallel, maybe off different boxes. This is feature creep.
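Several of the bullets above (auditing the whole dependency graph, the "no-snapshots" option, and walking backwards from a JAR to its users) come down to simple queries over a dependency graph. The following is a minimal Python sketch of that idea, not how Maven, Ant or Ivy work internally. The graph, the `group:artifact:version` coordinates and all function names are made up for illustration, and a version is assumed to be a snapshot when it ends in `-SNAPSHOT`.

```python
from collections import deque

# Hypothetical metadata: each artifact maps to its direct dependencies.
GRAPH = {
    "com.example:app:1.0": [
        "com.example:orm:3.1",
        "com.example:util:0.2-SNAPSHOT",
    ],
    "com.example:orm:3.1": ["com.example:logging-api:1.0.4"],
    "com.example:util:0.2-SNAPSHOT": ["com.example:logging-api:1.0.4"],
    "com.example:logging-api:1.0.4": [],
}


def transitive_dependencies(graph, root):
    """Breadth-first walk: every artifact reachable from root, in discovery order."""
    found, seen = [], set()
    queue = deque(graph.get(root, []))
    while queue:
        artifact = queue.popleft()
        if artifact in seen:
            continue
        seen.add(artifact)
        found.append(artifact)
        queue.extend(graph.get(artifact, []))
    return found


def snapshot_dependencies(graph, root):
    """Dependencies of root (direct or transitive) whose version is a snapshot."""
    return [d for d in transitive_dependencies(graph, root) if d.endswith("-SNAPSHOT")]


def check_no_snapshots(graph, root):
    """Halt (raise) if root depends on any snapshot release."""
    snapshots = snapshot_dependencies(graph, root)
    if snapshots:
        raise RuntimeError("snapshot dependencies found: " + ", ".join(snapshots))


def dependents_of(graph, artifact):
    """Walk backwards: which artifacts declare a direct dependency on this one?"""
    return sorted(name for name, deps in graph.items() if artifact in deps)
```

Here `check_no_snapshots(GRAPH, "com.example:app:1.0")` raises because of the `util` snapshot, and `dependents_of(GRAPH, "com.example:logging-api:1.0.4")` lists the two artifacts that would break if that JAR were deleted. A real tool would build the graph from POM files rather than a hand-written dictionary.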

Finally, it needs to be seen as important for people to put good artifacts and pom files up, for old and new versions. It needs to be easy, and it needs to be encouraged. Right now too many projects don't care; it's like 2000 when ant files and junit tests were still an intermittent feature of OSS java projects.

These are just thoughts. I have no plans to act on them (yet). I do find the m2 tasks and repository hard to work with, and am debating a quick investigation of Ivy. Maybe focusing on one thing, library management, has let them do a better job than trying to be all of a next generation build tool. I also have to appreciate the effort and responsiveness of the maven team, and admire the fact that they have built a great repository with scalability through mirrors. The main group I have issues with are Sun and their redistribution licenses. Those licenses went in after Visual J++ 1.1 added new things to the java.* packages, and left RMI out. Sun: those days are over. Move on. Think of what you have to do for Java to survive.

Comments

Re: Sometimes Maven repositories work
On 5 January 2006 at 00:12 Brett Porter commented:
Thanks Steve! Reading over this, I generally just said +1 to myself, we pretty much agree and if I had more hours in the day they'd all be done already.
Specifically:
1) we are starting to ship javadoc and sources and ask for them in uploads
2) still debatable. Improving security is definitely a priority though.
3) yes, we'll keep wishing for that :)
4) the Maven project is writing a repository search tool (expected completion: real soon now)
5) above repository manager will report on gaps and also help with preventing the uploads
6) We have cleaned up the common poms. Commons logging I thought had made those optional already - time to check. We address anything filed ASAP.
7) Part of the repository manager, not sure whether we'll integrate that into the dependency resolution though
8) part of the repository manager
9) yeah, that was a surprise. We *still* need to deal with it better, but I'll get onto planetmirror ASAP.
On the runtime - there are some people that have discussed that and we integrate the dependency resolver into other apps. It would be good to share these ideas.
On the ant tasks - not so much an afterthought, but it's not my speciality. I should have put more time into them.
Parallel downloads are a good idea, but would require some changes. We don't resolve until necessary - you'd get more benefit from queueing them all at the beginning and doing it in the background while the build progresses.
I think we've made it a lot easier for people to put up new POMs and for OSS projects to do it themselves. It is definitely encouraged. Any specific tips are welcome :) I agree that a lot of projects don't care. They need to realise this affects *their* users not just *Maven* users.
As for Ivy, focusing on library management does make them strong in that suit, and we've had discussions with them in the past. They still very much rely on the existence of the Maven repository to function.
Maven is still with them feature for feature, and I found similar usability problems using Ivy when I had to build WebWork. I'd be interested to hear your experiences with it. As far as the repo goes - they have very few transitive dependencies centralised (about 40), so they obviously work well if you can find what you need, because they are hand crafted, but I'm not sure that it scales to large numbers of artifacts, or letting projects do it themselves.
On 16 April 2007 at 21:23 Justin Lee commented:
The Sun redistribution rules are greatly relaxed these days: https://maven-repository.dev.java.net/nonav/repository/
Older versions probably won't be going up anytime soon, but the newer stuff can already be found. Especially for the EE libs as most of those come from glassfish anyway.