One thing that we are asked from time to time is whether there is an Eclipse plugin for bzr. At the moment, there is a project that has been started: bzr-eclipse
It is still in the very early stages, but it seems there is enough interest, so I figured I would explore the space a bit.
One issue is trying to figure out how to communicate between bzr (written in Python) and Eclipse (written in Java).
One obvious method is to write Java code which calls out to the bzr command-line program, and then parses the string output from stdout and stderr. This can work, but bzr isn't especially scriptable. It can be scripted, but it is more focused on being nice for a human to use than easy for a machine to parse.
We have a much richer machine API in bzrlib, the Python library which is the guts of bzr. Wouldn't it be nice if we could get direct access to this rich API?
Well, there are two projects that I know of: Jython, and one I just heard about, JEPP (Java Embedded Python).
Jython has the goal of running Python code directly on the Java Virtual Machine. I'm not sure of everything that this entails, but my understanding is that it is basically a compiler that turns Python code into Java bytecodes. I have high hopes for this project, but at the moment it only supports Python 2.3 syntax (if you use the current beta). Unfortunately, bzr is written with Python 2.4 syntax in mind. (We use decorators a lot, and some generator expressions.)
The other (major?) limitation is that Jython doesn't have a good way to support "os.chdir()". And while our general code doesn't actually use it, our test suite makes heavy use of "os.chdir()" to make sure that each test runs in isolation. Other limitations include not having a complete Python standard library: again, we use subprocess in the test suite when we want to ensure a clean run of bzr, and we also use logging. There is also some concern about C extensions. At the moment, bzr is written in 100% Python code, but as we finalize our data structures, we would like to implement any heavy processing loops in C/C++ (or possibly Pyrex, which compiles to C).
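For what it's worth, the chdir-based isolation is roughly this pattern (a simplified sketch, not bzr's actual test base class):

```python
import os
import tempfile
import unittest


class TestCaseInTempDir(unittest.TestCase):
    """Each test runs in its own scratch directory, restored afterwards."""

    def setUp(self):
        self._old_cwd = os.getcwd()
        self._test_dir = tempfile.mkdtemp(prefix='test-')
        os.chdir(self._test_dir)

    def tearDown(self):
        # Restore the original working directory so tests can't
        # interfere with each other.
        os.chdir(self._old_cwd)

    def test_creates_file_in_isolation(self):
        # Relative paths resolve inside the scratch directory.
        with open('scratch.txt', 'w') as f:
            f.write('hello')
        self.assertTrue(os.path.exists(
            os.path.join(self._test_dir, 'scratch.txt')))
```

Jython not supporting os.chdir() breaks exactly this kind of setup.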
But we could probably work around most of the missing functionality. The biggest thing is just Python 2.4 compatibility.
But this week I was exposed to Java Embedded Python, or JEPP, which takes the other approach: rather than implementing the Python language in Java, it embeds a CPython interpreter in a Java process.
This means you can use whatever CPython you have available on your system (2.3, 2.4, 2.5?). And you are sure to have access to the full standard library, extensions should never be a problem, etc.
The only real limitation of this approach is figuring out how well you can expose the embedded CPython interpreter. At a basic level, it isn't much different than calling 'python -c "do something"'. But it is possible to create a richer interaction between the CPython interpreter and the JVM, which is what JEPP is trying to do.
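To make the comparison concrete, the 'python -c' level of integration looks something like this (a sketch using the modern subprocess API; the Java side would do the equivalent with Runtime.exec):

```python
import subprocess
import sys

# The crudest integration: spawn an interpreter, hand it a snippet,
# and parse whatever comes back on stdout as a string.
proc = subprocess.run(
    [sys.executable, '-c', 'print(2 + 3)'],
    capture_output=True, text=True, check=True)
result = int(proc.stdout.strip())
```

Everything is a string at the boundary, which is exactly the limitation a richer embedding tries to remove.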
I played with JEPP today, and I think it is a really good start. It isn't functional enough yet that I would use it for a large project, but it seems almost there. At the moment it is able to return integers, floats, longs, and strings, but it isn't able to pass Python objects back and forth.
It does let you do stuff like:
Jep jep = new Jep(false, ".");
jep.runScript("a_python_script.py");
And the script can have quite a bit of logic. The script is run as '__main__', and the variables, functions, etc. are in the running namespace. So you can do stuff like:
Object value = jep.getValue("variable");
or
Object ret = jep.invoke("a_function", "param1", 2, 3);
If "a_function" returns a "basic" type (int, long, float, str), then the returned Java Object is an Integer, Float, String, etc.
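For illustration, a_python_script.py might contain something like this (hypothetical contents, matching the calls above):

```python
# a_python_script.py -- runs as __main__ inside the embedded
# interpreter, so these names land in the namespace that the Java
# side queries with getValue() and invoke().
variable = 42


def a_function(s, b, c):
    # Returns a plain int, which JEPP can hand back as a Java Integer.
    return len(s) + b + c
```

jep.getValue("variable") would then come back as an Integer 42, and the invoke() call above returns len("param1") + 2 + 3 = 11.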
The only thing that doesn't work well is when the returned object is not a basic type. The code falls back to the catch-all, which converts everything to a string. I don't think this is the long term plan for the project, because they have a "PyObject" Java class.
I would expect the PyObject class to develop functions similar to Boost::Python's boost::python::object class.
I don't know if they will end up exposing as much of the API (slice is a nice convenience function, but logically maybe it shouldn't be on object), but ones like attr would certainly be useful, as they also give you a way to call member functions, etc.
I know Boost does a lot of work behind the scenes with templates, and Java doesn't have the same functionality. I don't know if Java "Generics" are up to the task of PyObject(function).
Now I just have to figure out how to get commit notifications for a Sourceforge SVN project, so I can watch it evolve. :)
Friday, March 30, 2007
Tuesday, March 27, 2007
Test DRIVEN Development
For the Bazaar project we have a general goal that all code should be tested. We have an explicit development workflow that all new code must have associated tests before it can be merged into mainline.
Our latest release (0.15, rc3 is currently out the door, final should happen next week) introduces a new disk format, and a plethora of new tests. ('bzr selftest' in 0.14 has approx 4400 tests, and 5900 in 0.15.) A lot of these are interface tests. Since we support multiple working tree, branch, and repository formats, we want to make sure that they all work the same way. (So only one test is written, but it may be run against 4 or 5 different formats.)
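The pattern is roughly this (a toy sketch with hypothetical format classes, not bzr's actual test adapters):

```python
import unittest


class FormatA:
    def initialize(self):
        return {'format': 'A', 'files': []}


class FormatB:
    def initialize(self):
        return {'format': 'B', 'files': []}


def make_interface_suite(formats):
    """Multiply one interface test across every registered format."""
    suite = unittest.TestSuite()
    for fmt in formats:
        class InterfaceTest(unittest.TestCase):
            format = fmt  # bound per-iteration when the class body runs

            def test_initialize_starts_empty(self):
                tree = self.format.initialize()
                self.assertEqual(tree['files'], [])

        suite.addTest(InterfaceTest('test_initialize_starts_empty'))
    return suite
```

One test method, but the suite runs it once per format.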
It means that we have a very good feeling that our code is doing what we expect it to (all of the major developers dogfood the current mainline). However, it comes at a bit of a cost, in that running the full test suite gets slower and slower.
Further, I personally follow more of a 'test-after-development' style, and I'm trying to get into the test-driven development mindset. I don't know how I feel just yet, but I was reading this. And whether you agree with all of it, it makes it pretty clear how different the mindset can be. It goes through several iterations of testing, coding, and refactoring before it ends up anywhere I consider "realistic". And a lot of that comes at the 'refactoring' step, not at the coding step.
I have a glimpse at how it could be useful, as the idea is to have very small iterations. Such that it can be done in the 3-5 minute range. And every 3-5 minutes you should have a new test which passes. It means that you frequently have hard-coded defaults, since that is all the tests require at this point. But it might also help you design an interface, without worrying about actually implementing everything.
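A first iteration really can be that small; something like this (hypothetical example):

```python
# Iteration 1: the only test so far asks for one case, so the
# "implementation" is a hard-coded default.
def greeting(name):
    return 'Hello, World'


def test_greeting():
    assert greeting('World') == 'Hello, World'


# Iteration 2, a few minutes later: a second case forces the
# generalization, and the hard-coded default is refactored away.
def greeting_v2(name):
    return 'Hello, %s' % name


def test_greeting_v2():
    assert greeting_v2('Bazaar') == 'Hello, Bazaar'
```

The hard-coding feels silly, but it keeps each step in the 3-5 minute range.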
He also makes comments about keeping a TODO list, which was the part that made the most sense to me. Because you can't ever write all the code fast enough to get all the ideas out of your head. So you keep a TODO so you don't forget, and also so you don't feel like you need to go down that path right now.
The other points that stuck with me are that most tests should be "unit tests", which by his definition means they are memory-only and very narrow in scope. And that the test suite should be fast to run, because once it gets under a threshold (his comment was around 10 seconds, not minutes) you can actually run all of them, all the time.
And since a development 'chunk' is supposed to be 3-5 minutes, it is pretty important that the test suite only take seconds to run. The 10s mark is actually reasonable, because it is about as long as you would be willing to give to that single task. Any longer and you are going to be context switching (email, more code, IRC, whatever).
The next level of test-size that he mentions is an "integration" test. I personally prefer the term "functional" test. But the idea is that a "unit" test should be testing the object (unit) under focus, and nothing else. Versus a functional test that might make use of other objects, and disk, database, whatever. And then the top level is doing an "end-to-end" test. Where you do the whole setup, test, and tear down. And these have purpose (like for conformance testing, or use case testing), but they really shouldn't be the bulk of your tests. If there is a problem here, it means your lower level tests are incomplete. They are good from a "the customer wants to be able to do 'X', this test shows that we do it the way they want" viewpoint.
I think I would like to try real TDD sometime, just to get the experience of it. I'll probably try it out on my next plugin, or some other small script I write. I have glimpses of how these sorts of things could be great. Often I'm not sure how to proceed while developing because the idea hasn't solidified in my head. One possibility here is "don't worry about it", create a test for what you think you want, stub out what you have to, and get something working.
Of course, the more I read, the more questions spring up. For example, there is a lot of discussion about test frameworks. Python comes with 'unittest', which is based on the general JUnit (or is it SUnit) framework. Where you subclass from a TestCase base class, and have a setUp(), and tearDown(), and a bunch of test_foo() tests.
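That shape, in Python's unittest, is a minimal sketch like:

```python
import unittest


class TestStack(unittest.TestCase):
    # setUp runs before, and tearDown after, every test_foo method.
    def setUp(self):
        self.stack = []

    def tearDown(self):
        del self.stack

    def test_push(self):
        self.stack.append(1)
        self.assertEqual(self.stack, [1])

    def test_pop(self):
        self.stack.append(1)
        self.assertEqual(self.stack.pop(), 1)
        self.assertEqual(self.stack, [])
```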
But there is also nose and py.test, which both try to overcome unittest's limitations. And through reading about them, there is a discussion that Python 3000 will actually have a slightly different default testing library (for sundry technical and political reasons).
And then there is the mock versus stub debate. As near as I can tell, it centers on how to create a unit test when the object under test depends on another object, and which method is more robust, easier to maintain, and easier to understand. That link lends some interesting thought about mock objects: instead of testing the state of objects, you are actually making an explicit assertion that the object being tested will make specific calls on the dependency.
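A hand-rolled sketch of the difference (hypothetical objects, not any particular mocking library):

```python
# Stub style: the fake collaborator gives canned answers, and the
# test asserts on the resulting state or return value.
class StubMailer:
    def send(self, to, body):
        return True


# Mock style: the fake collaborator records calls, and the test
# asserts that the expected interaction happened.
class MockMailer:
    def __init__(self):
        self.calls = []

    def send(self, to, body):
        self.calls.append((to, body))


def notify(mailer, user):
    mailer.send(user, 'your branch was merged')


mock = MockMailer()
notify(mock, 'alice')
# The mock-style assertion is on the call itself, e.g. that
# mock.calls == [('alice', 'your branch was merged')].
```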
I'm not settled on my decision there, because it feels like you are testing an exact implementation, rather than testing the side effect (interface). Some of what I read says "yes, that is what you are doing, and that is the point." I can understand testing side effects. I guess part of it is how comfortable you are with having your test suite evolve. At least some tests need to be there to say that the interface hasn't changed since the previous release (or that a bug hasn't been reintroduced). If that edge case was tested by a particular test, and that test gets refactored, do you have confidence you didn't re-introduce the bug?
I guess you could have specific conventions about what tests are testing the current implementation, versus the overall interface of a function or class. I can understand that you want your test suite to evolve and stay maintainable. But at the other end, it is meant to test that things are conforming to some interface, so if you change the test suite, you are potentially breaking what you meant to maintain.
Maybe it just means you need several tiers of tests, each one less likely to be refactored.
Wednesday, March 14, 2007
Reading and Writing to Files ('r+', 'w+' mode) on Windows
It turns out that Windows has a small oddity when reading and writing to a file. It is reported in the 'fopen' documentation at MSDN:
http://msdn2.microsoft.com/en-us/library/yeby3zcb(vs.71).aspx
The specific quote is:
When the "r+", "w+", or "a+" access type is specified, both reading and writing are allowed (the file is said to be open for "update"). However, when you switch between reading and writing, there must be an intervening fflush, fsetpos, fseek, or rewind operation. The current position can be specified for the fsetpos or fseek operation, if desired.
As an example, here is what you might do in Python:
>>> f = open('test', 'wb+')
>>> f.write('initial text\n')
>>> f.close()
>>> f = open('test', 'rb+')
>>> f.read()
'initial text\n'
>>> f.write('this should go at the end\n')
On most platforms, that succeeds. But on Windows, if you don't do
>>> f.seek(0, 2) # Seek to the end of the file
before you call f.write(), you will get an IOError, with e.errno = 0. (Yeah, having an error of SUCCESS is a little hard to figure out).
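So the portable pattern is to always put an intervening seek (or flush) between the read and the write; a sketch:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'test')

with open(path, 'wb+') as f:
    f.write(b'initial text\n')

with open(path, 'rb+') as f:
    f.read()
    # The intervening seek keeps the Windows C runtime happy when
    # switching from reading to writing.
    f.seek(0, 2)  # seek to end of file
    f.write(b'this should go at the end\n')

with open(path, 'rb') as f:
    contents = f.read()
```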
Anyway, it took a while for me to figure out, so I figured I'd let other people know.
11 Steps to creating a new Launchpad Project
I frequently create new projects in launchpad, as I generally have a new "product" for every plugin that I write. I figured I would write down the specific steps I use, because there are a few non-obvious links that you need to use.
0) One quick point of terminology. Launchpad has the idea of "projects" and "products". A "project" is a collection of "products". For example we have the Bazaar project, which includes the "bzr" program, as well as plugins for "bzr". It is a little foreign to me, since I consider what I work on a "project" rather than a "product". But I understand the need for a higher level grouping, and I can't say that I have better names to distinguish them.
Also, each product gets a set of "series". These are generally meant along the lines of "release series". Most projects will have a development series (by default this is called "trunk"), and possibly some release series, especially large projects which have concurrent development (think of Firefox, which has a 2.0 series and a 1.5 series, since you get 1.5.1 and 2.0.1).
1) Go to Launchpad itself: https://launchpad.net
2) Go to the products page https://launchpad.net/products
The link on the main page is "register your product".
3) If this is an existing project, you probably want to search and make sure it isn't already registered in Launchpad. In my case, these are always new projects, so I don't worry about it.
4) Follow the "Register a Product" link on the upper left (https://launchpad.net/products/+new).
5) Fill out the basic information for the product. In my case, most of my products fall under the "bazaar" project banner. When creating a new plugin for "bzr", the general convention is to call it "bzr-plugin-name". It certainly isn't required, it is just a convention that I've tried to follow.
6) Change the product to use Malone (Launchpad's bug tracker) as the official bug tracker. This is the link "Define Launchpad Usage" on the left. (https://launchpad.net/PRODUCT/+launchpad). You may also enable Rosetta translations at this time.
7) Change the Maintainer of the product to a shared group. I usually want other people to be able to update the details of the product, update the bug tracker, etc. So I set the project as "owned" by the "bzr" group. That is done by following the "Change Maintainer" link (https://launchpad.net/PRODUCT/+reassign).
8) Now you want to create a Bazaar branch for the mainline of the project. You can do this through the "Register Branch" links. I personally tend to host my branches on Launchpad itself (hosting is free, and it is bandwidth I don't need to pay for). So I do a simple:
cd $local_branch
bzr push sftp://user@bazaar.launchpad.net/~bzr/PRODUCT/trunk
A bit of explanation: "user" must be your Launchpad user name, and "~bzr" can be either your username or the name of the group from step 7. As I mentioned, I prefer the mainline to be a shared branch, so other people can update it if I'm too busy or cannot be contacted for some reason.
9) Now update the "trunk" series to point to this new branch. There should be a link on the main page (https://launchpad.net/PRODUCT) to the "trunk" series. Or you can link more directly to it at (https://launchpad.net/PRODUCT/trunk).
You want to "Change Series Details" for this series (https://launchpad.net/PRODUCT/trunk/+edit).
10) At this point, you can change the name of the series (maybe you prefer "mainline" over "trunk"). You also can change the description. I usually leave them alone. What I do change is the "Branch". I generally follow the "Choose" link, which lets me search through all branches registered for this product. (Note, pushing to sftp://bazaar.launchpad.net/~USER/PRODUCT/BRANCH-NAME, will automatically register the branch)
11) And you're done. It took a little while, but now you have a fully functioning bug tracker and branch tracker. You are also able to tell people to get your product with:
bzr branch lp:PRODUCT PRODUCT
And they will get the latest development version.
By registering your branches, you now have the ability to link them with bugs, so that users who find a bug, can see that there is already a fix, even if it hasn't been included in mainline yet.
(Edited to fix "Product" versus "Project")
Friday, March 2, 2007
Dirstate: another 2x performance boost
Another round of performance optimizations in the dirstate code brings us down from 15s to 8s for a complete 'bzr status' in a tree with 50,000 files and 5,000 directories. (bzr 0.14 takes approx 30s on the same tree.)
There were a few tricks and a few cleanups.
1) Make one pass over the filesystem, rather than 2. We were making a second pass to check for unknown files rather than determining that in the first pass. It doesn't take as long as the first pass, since things are usually cached, but it is work that doesn't need to be done.
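In rough (and much simplified) form, the single pass looks like this; the real dirstate code is considerably more involved:

```python
import os


def classify(tree_root, versioned):
    """One walk over the filesystem, sorting every entry into known
    vs. unknown, instead of a second walk just to find unknowns."""
    known, unknown = [], []
    for dirpath, dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), tree_root)
            (known if rel in versioned else unknown).append(rel)
    return known, unknown
```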
2) Work in raw filesystem paths when possible.
Internally in bzr we try to work in Unicode strings as much as possible. It makes things consistent across platforms (I can check in a file called جوجو.txt on Windows, and have it show up with the correct filename on Mac and Linux). In fact, on Windows you need to use the Unicode API if you want to get the correct filenames. (They have an OEM name and a Unicode name, but if the characters are not in your codepage you get ????.txt in OEM mode.)
However, on Linux, if you want to use Unicode filenames, Python has to decode every name that it finds (the difference between os.listdir('.') and os.listdir(u'.')).
With the dirstate refactoring, we now have a layer that can work in utf8 paths to find changes before it goes up to the next layer which can deal in Unicode paths for simplicity.
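(In modern Python 3 terms the os.listdir('.') versus os.listdir(u'.') split is bytes versus str, but the point is the same: walk with raw paths, and decode lazily at the presentation layer.)

```python
import os
import tempfile

d = tempfile.mkdtemp()
open(os.path.join(d, 'file.txt'), 'w').close()

# Bytes in, bytes out: no per-entry decoding during the walk itself.
raw_names = os.listdir(os.fsencode(d))

# Decode only when a human-readable path is actually needed.
decoded = [os.fsdecode(n) for n in raw_names]
```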
3) There is still more we can do. We are trying to continue doing this as a series of correctness preserving steps. But I am happy to say that we are getting some very good results after the last few months of effort. I honestly didn't think that the performance benefits would be this great this early.