Hudson supports the "master/slave" mode, where the workload of building projects is delegated to multiple "slave" nodes, allowing a single Hudson installation to host a large number of projects, or to provide the different environments needed for builds/tests. This document describes this mode and how to use it.
How does this work?

A "master" is an installation of Hudson. When you weren't using the master/slave support, a master was all you had. Even in the master/slave mode, the role of a master remains the same. It will serve all HTTP requests, and it can still build projects on its own.

Slaves are computers that are set up to build projects for a master. Hudson runs a separate program called a "slave agent" on slaves. There are various ways to start slave agents, but in the end a slave agent and the Hudson master need to establish a bi-directional byte stream (for example, a TCP/IP socket).

When slaves are registered to a master, the master starts distributing load to the slaves. The exact delegation behavior depends on the configuration of each project. Some projects may choose to "stick" to a particular machine for a build, while others may choose to roam freely between slaves. For people accessing the Hudson website, things work mostly transparently. You can still browse javadoc, see test results, and download build results from the master, without ever noticing that builds were done by slaves.

Follow the Step by step guide to set up master and slave machines to quickly start using distributed builds.

Different ways of starting slave agents

Pick the right method depending on your environment and the OS that the master and slaves run.

Have master launch slave agent via ssh

Hudson has a built-in SSH client implementation that it can use to talk to a remote sshd and start a slave agent. This is the most convenient and preferred method for Unix slaves, which normally have sshd out of the box. Click Manage Hudson, then Manage Nodes, then click "New Node." In this set up, you'll supply the connection information (the slave host name, user name, and ssh credential). Note that the slave will need the master's public ssh key copied to ~/.ssh/authorized_keys. (This is a decent howto if you need ssh help.)
Hudson will do the rest of the work by itself, including copying the binary needed for a slave agent, and starting/stopping slaves. If your project has external dependencies (like a special ~/.m2/settings.xml, or a special version of Java), you'll need to set those up yourself, though. [Where is this documented?] This is the most convenient set up on Unix.

Have master launch slave agent on Windows

For Windows slaves, Hudson can use the remote management facility built into Windows 2000 or later (WMI+DCOM, to be more specific). In this set up, you'll supply the username and the password of a user who has administrative access to the system, and Hudson will use them to remotely create a Windows service and remotely start/stop it. This is the most convenient set up on Windows, but it does not allow you to run programs that require display interaction (such as GUI tests).

Note: Unlike with other node configuration types, the node's name is very important here, because it is used as the address of the machine on which the service is created.

Write your own script to launch Hudson slaves

If the above turn-key solutions do not provide the flexibility you need, you can write your own script to start a slave. You place this script on the master, and tell Hudson to run this script whenever it needs to connect to a slave.

Typically, your script uses a remote program execution mechanism like SSH, RSH, or other similar means (on Windows, this could be done by the same protocols through cygwin or tools like psexec), but Hudson doesn't really assume any specific method of connectivity. What Hudson expects from your script is that, in the end, it executes the slave agent program, like java -jar slave.jar, on the right computer, and has its stdin/stdout connected to your script's stdin/stdout. For example, a script that does "ssh myslave java -jar ~/bin/slave.jar" would satisfy this.

A copy of slave.jar can be downloaded from http://yourserver:port/jnlpJars/slave.jar .
Many people write their scripts so that this 160K jar is downloaded each time the script runs, to make sure a consistent version of slave.jar is always used. The SSH Slaves plugin does this automatically, so slaves configured using this plugin always use the correct slave.jar.
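As a rough illustration of such a launch script, here is a minimal sketch in Python. It is not part of Hudson: the file name launch_slave.py, the function build_ssh_command, and the relative jar path bin/slave.jar are all hypothetical, and it assumes key-based ssh login from the master to the slave and that curl and java are on the slave's PATH. It re-downloads slave.jar on every launch and then hands its own stdin/stdout over to ssh, which is what Hudson expects from the script.

```python
#!/usr/bin/env python3
"""Hypothetical launch script run on the master (a sketch, not Hudson code)."""
import os
import shlex
import sys


def build_ssh_command(slave_host, master_url, jar_path="bin/slave.jar"):
    """Return the argv that refreshes slave.jar on the slave, then starts the agent.

    jar_path is relative to the remote login directory (assumed to be the
    user's home); a leading "~" is avoided because quoting would stop the
    remote shell from expanding it.
    """
    jar_url = master_url.rstrip("/") + "/jnlpJars/slave.jar"
    # curl -sSf -o: stay quiet, fail on HTTP errors, and write to a file,
    # so nothing but the agent's own traffic ends up on stdout.
    remote = "curl -sSf -o {jar} {url} && exec java -jar {jar}".format(
        jar=shlex.quote(jar_path), url=shlex.quote(jar_url)
    )
    return ["ssh", slave_host, remote]


if __name__ == "__main__":
    # Assumed usage: launch_slave.py <slave-host> <master-url>
    argv = build_ssh_command(sys.argv[1], sys.argv[2])
    # exec replaces this process, so ssh inherits the stdin/stdout
    # that Hudson connected to the script.
    os.execvp(argv[0], argv)
```

In the node configuration you would then point the launch command at something like "python3 launch_slave.py myslave http://yourserver:port", adjusting the host name and URL to your own installation.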
Launching slaves this way often requires additional initial set up on the slaves (especially on Windows, where a remote login mechanism is not available out of the box), but the benefit of this approach is that when the connection goes bad, you can use Hudson's web interface to re-establish the connection.

Launch slave agent via Java Web Start

Another way of doing this is to start a slave agent through Java Web Start (JNLP). In this approach, you'll interactively log on to the slave node, open a browser, and open the slave page. You'll then be presented with the JNLP launch icon. Upon clicking it, Java Web Start will kick in and launch a slave agent on the computer where the browser is running.

On Windows, you can do this manually once, then from the launched JNLP slave agent, you can install it as a Windows service so that you don't need to interactively start the slave from then on.

If you need display interaction (e.g. for GUI tests) on Windows and you have a dedicated (virtual) test machine, this is a suitable option. Create a hudson user account, enable auto-login, and put a shortcut to the JNLP file in the Startup items (after having trusted the slave agent's certificate). This also allows you to run tests as a restricted user.

Launch slave agent headlessly

This launch mode uses a mechanism very similar to Java Web Start, except that it runs without a GUI, making it convenient for running as a daemon on Unix. To do this, configure this slave to be a JNLP slave, take slave.jar as discussed above, and then from the slave, run a command like this:

    $ java -jar slave.jar -jnlpUrl http://yourserver:port/computer/slave-name/slave-agent.jnlp

Make sure to replace "slave-name" with the name of your slave.

Other Requirements

Also note that the slaves are a kind of a cluster, and operating a cluster (especially a large one or heterogeneous one) is always a non-trivial task.
For example, you need to make sure that all slaves have JDKs, Ant, CVS, and/or any other tools you need for builds. You need to make sure that slaves are up and running, etc. Hudson is not a clustering middleware, and therefore it doesn't make this any easier.

Example: Configuration on Unix

This section describes my current set up of Hudson slaves that I use inside Sun for my day job. My master Hudson node is running on a SPARC Solaris box, and I have many SPARC Solaris slaves, Opteron Linux slaves, and a few Windows slaves.
Scheduling strategy

Some slaves are fast, while others are slow. Some slaves are close (network-wise) to the master, others are far away. So doing a good build distribution is a challenge. Currently, Hudson employs the following strategy:
If you have interesting ideas (or better yet, implementations), please let me know.

Transition from master-only to master/slave

Typically, you start with a master-only installation and then much later you add slaves as your projects grow. When you enable the master/slave mode, Hudson automatically configures all your existing projects to stick to the master node. This is a precaution to avoid disturbing existing projects, since most likely you won't be able to configure slaves correctly without trial and error. After you configure slaves successfully, you need to individually configure projects to let them roam freely. This is tedious, but it allows you to work on one project at a time.

Projects that are newly created on a master/slave-enabled Hudson will be configured to roam freely by default.

Master on public network, slaves within firewall

One might consider setting up the Hudson master on the public network (so that people can see it), while leaving the build slaves within the firewall (because having a lot of machines on the internet is expensive). There are two ways to make it work:
Note that in both cases, once the master is compromised, all your slaves can be easily compromised (in other words, a malicious master can execute arbitrary programs on slaves), so both set ups leave much to be desired in terms of isolating a security breach. The Build Publisher Plugin provides another way of doing this, in a more secure fashion.

Running Multiple Slaves on the Same Machine

It is possible to run multiple slave instances on a Windows machine, and have them installed as separate Windows services so they can start up on system startup. While the correct use of executors largely obviates the need for multiple slave instances on the same machine, there are some unique use cases to consider:
Follow these steps to get multiple slaves working on the same Windows box:
When you go to create the second node, it is convenient to copy an existing node, so copy the first node you set up. Then you just tweak the Remote FS Root and a couple of other settings to make it distinct. When you are done, you should have two (or more) Hudson slave services in the list of Windows services.

Troubleshooting tips

Some interesting pages on issues (and resolutions) occurring when using Windows slaves:
Some more general troubleshooting tips:
Other readings
Comments (36)
Jun 29, 2007
Anonymous says:
You should consider expanding on the section about launching slaves via Java WebStart. Took me a bit to figure it out. I'll even write it up if you like.
Jun 29, 2007
Kohsuke Kawaguchi says:
Yes, please! Much appreciated.
Aug 01, 2007
Anonymous says:
Can someone give me a hint please how to open the slave page (url) ?
thx!
Sep 20, 2007
Anonymous says:
On my master (Linux) node I have added an Ant instance which points to the /opt/ant-1.7.0 directory. Now, some build can be performed only on Windows so I've defined a Windows slave spawned via JNLP. But every build will fail, because Ant is not in /opt/ant-1.7.0 but somewhere else (c:\ant or whatever).
Same question about JDK path.
Sep 20, 2007
Anonymous says:
Ok, I have found that if I put Ant into c:\opt\ant-1.7.0 it seems to work. Nevertheless I think that such things could be configurable per slave. Of course plugins could be able to contribute paths to slave configurations
Oct 24, 2007
Anonymous says:
If the Hudson master is running at
http://hudsonmaster:8080/hudson
then you would log in to the remote slave server, open a browser and enter the above URL. On the left-hand menu, you will see the Build Executor Status section with "Master" and your remote slave listed below. Click the slave name link and on the resulting page you will see the "Launch" button for Java Web Start.
Oct 10, 2007
Daniel Pike says:
Something very useful in corporate networks is the ability to run slaves as a Windows service. After a fair bit of playing I have been able to achieve this using the Tanuki software's Java service wrapper. I am happy to write something up and send it through if it would be useful?
Oct 10, 2007
Kohsuke Kawaguchi says:
Yes, by all means!
Oct 18, 2007
Daniel Pike says:
Done, sent through to Kohsuke's address
Oct 16, 2007
Jeff Black says:
Second that!
Jan 07, 2008
Anonymous says:
When hudson tries to launch a slave it complains that it cannot find maven-agent.jar I don't know what maven is. What do I need to do to make hudson happy?
Jan 25, 2008
Anonymous says:
Within a *nix system, you might be able to view top or uptime, looking for the load average on that host, then weight it and schedule work based on that value.
Feb 13, 2008
Anonymous says:
If I have 2 slaves build machines building the same project, is there a way to configure Hudson to utilize the build machines at the same time? for example, if machine 1 is already building and Hudson detects a change in the source code repository, can machine 2 start the build for the new checkin? That way there is no need to wait for machine 1 to finish to get feedback on the last checkin.
Jun 18, 2009
Thomas Guieu says:
I would appreciate this feature too !
Is there a way to do that ? Or to manually start several builds of the same job ?
May 23, 2008
Michael Manz says:
An idea for a further rule for the scheduling strategy:
4. If a build depends on another build, try to build it on the same node that previously built the parent build.
Dec 01, 2008
Leon Franzen says:
Our organization uses Maven2. We deploy a master POM that nearly all projects directly or indirectly inherit from. The POM project is a job in Hudson. We intend to deploy the SNAPSHOT build when changes committed to the POM are picked up and when all downstream jobs pass. Does Hudson provide a mechanism for synchronizing successful slave artifact SNAPSHOT builds so that all downstream builds on different slaves use the CI-installed (as opposed to deployed) artifacts?
Optimally I would like to do the following:
1) Hudson job "master pom": mvn install
2) Do all downstream jobs (triggered by Hudson)
3) If step 2 successful perform snapshot deploy "master pom" to primary repository.
4) Deploy downstream projects
Or, because we are using various (identically configured) slaves to perform the jobs, do we have to install a Hudson dedicated repository and do the following?:
1) Hudson job "master pom": mvn deploy to Hudson repository.
2) Do all downstream jobs
3) If step 2 successful, mvn deploy "master pom" to primary repository.
4) Deploy downstream projects
Dec 12, 2008
Kin Namier says:
Is it possible to have a slave authenticate as a certain user? I've finally got a nice master/slave network up and running, but in order to do so, I had to grant read access to the "Anonymous" user on my main Hudson server, which opens up the site to anybody who wants to browse our projects, download build artifacts, etc. I would really prefer to have the anonymous user have no rights at all, and require usernames/passwords for all of our users, but then this would break our slaves...
So, is it possible to tell the slave to log into hudson with a given username and password? I've looked through the documents here on the wiki, the config XML, and tried passing "-help" to all available programs, but I can't seem to find anything.
Oct 13, 2009
Stefan Bäumler says:
I agree this is a defect (still exists in version 1.323); the whole authentication is worth less when I must grant read access to Anonymous only to enable build slaves to connect to the master. Does an issue for this already exist, or is a fix already planned for a future version?
Sep 29, 2010
Thomas Johnson says:
It looks like slave.jar accepts the following arguments: -jnlpCredentials USER:PASSWORD
Providing Overall Read Access to the user seems sufficient to get a slave up and running. This means that you can run the slave with the following command:
java -jar slave.jar -jnlpUrl http://build.example.com/computer/12.34.56.78/slave-agent.jnlp -jnlpCredentials builder:12345
If you're running the Windows service, you can tweak the "arguments" node inside c:\hudson\hudson-slave.xml to contain this option. This does leave the issue of securely storing the password in the launch script, but it does achieve the "no rights for anonymous" objective.
Jan 29, 2009
David Multer says:
If you configure a Windows slave using Cygwin for sshd, I recommend not using the CVS client that's part of Cygwin. I noticed that it was munging CRLF line terminations on DOS format files. It's possible that some combination of settings could avoid the problem, but I decided to uninstall Cygwin CVS and install TortoiseCVS (with CVSNT) instead. Once CVSNT is added to the PATH, everything worked perfectly.
Apr 22, 2009
Homer Yau says:
Thanks for the writeup on the master/slave setup.
May 22, 2009
Chris Hines says:
I have a working Master (Linux) / Slave (Windows XP) setup. I use the WMI interface to launch the slave.
Getting the master to successfully connect to the slave and launch Hudson remotely required some configuration changes on the Windows machine that were not documented here. The exception message returned during the initial failed attempts provided just enough information for me to find this useful page on the J-Integra site that solved my problem. Maybe others will find this information useful too.
Sep 21, 2009
sheilly agrawal says:
All the posts seem to indicate that the Master has to be a Linux machine. Can Windows (Windows XP) be used as a Master? I wanted to have Windows XP as master that would trigger builds on slaves (Windows machines again). Is that possible? Where can I find detailed instructions on this one?
Oct 26, 2009
Saniya Chopra says:
M using Linux master and Windows slave. I used JNLP to launch the slave on Windows. When I used the services.msc command this opened services window and from there I tried to start HudsonSlave service and there it gave the following error
Could not start the HudsonSlave service on Local Computer
Error1053: The service did not respond to the start or control request in a timely fashion.
Plz help me solve this error
Jan 21, 2010
prashanth says:
When I use the windows 2003 Master and windows 2003 slave I am having a issue while running the build on slave, it fails with (Fatal : Unable to build script). I think it is adding an extra slashes \, how to eradicate this without this it is not working.
Started by user anonymous
Building remotely on testing1
[DIMENSIONS] Removing 'file:/C:/Hudson/workspace/PROJ1_DEV1/'...
[DIMENSIONS] Checking out project "Test_DEV:PROJ1_DEV1"...
[DIMENSIONS] (Note: Dimensions command output was -
[DIMENSIONS] SUCCESS: Operation completed
[DIMENSIONS] Operation completed
[DIMENSIONS] )
[DIMENSIONS] Dimensions project was successfully locked
FATAL: Unable to find build script at C:\Hudson\workspace\PROJ1_DEV1\\proj1\build\build.xml
[DIMENSIONS] Dimensions project was successfully unlocked
Finished: FAILURE
Apr 14, 2010
Bobbi Newman says:
I am using Hudson to launch a distributed build for automated test purposes. The master Hudson server is a Linux box; the slave, which only runs this set of tests, is a Windows box. We use ant on the Windows box to launch the automated tests. The slave is set up using JNLP and autologon. The process works fine; the only question is that sometimes in the error logs I will find that ant returned a non-zero exit code from the automated test process (the exit codes are for the automated tests, for example, informing us that exceptions occurred during the automated tests) on the slave machine. This is printed in the console output for the build, but the build still succeeds. I'd like to be able to use that exit code to send descriptive email, but I can't seem to find any way to access that information. Am I missing something?
May 07, 2010
Sven Oppermann says:
Are there plans to add another distribution build like:
1. the hudson master will get the source code changes from the SCM
2. rsync this with a slave node
3. finally builds it on the slave node
Im asking because, i'm using Synergy as SCM and a Synergy Project can only exists once. So if a build runs initially on node 1 and next time on node 2, synergy is moving the complete project to node 2.
May 17, 2010
meenu says:
I am working on distributed builds from hudson. One Windows NT node is being used as slave. I am trying to perform a build on a mounted drive on the Windows node. But hudson does not understand drive names such as U:\ or any other name. It only recognises the absolute path.
My requirement is to use the network drive without giving the absolute path.
Please guide if we can perform this functionality from hudson
Aug 19, 2010
Alex Lea says:
We managed to get this working by adding a subst command to the build
subst U: C:\mydir
The service should then be able to see the U: drive. You may also need to configure the Hudson slave to run as a process with network access rights (not sure whether this is strictly necessary).
Jun 28, 2010
Norbas says:
Lets assume that we have Master M and slave A, B and C.
How can we execute on demand the Job Z, only in one of the slaves?
For example: I want to run Job Z on Slave C. and later I want to run Job Z in Slave B.
Cheers
Jan 13, 2011
arya ahmadi-ardakani says:
Go to Hudson, select the job Z, and click on configure. On the job configure page, click on "Restrict where this project can be run" and in the "label expression" input box enter the name of the slave (ex: slave A). Then save this job.
Create the same job (job Z: job Z-1, Z-2) for the other slaves (ex: slave B and C), but set "Restrict where this project can be run" to slaves B and C.
I think you get the idea from here
Jan 14, 2011
arya ahmadi-ardakani says:
I was wondering, once the build is done (using a script that calls gcc and make and all sorts of things) on a slave, how can I use those files to package them?
example: I have 3 builds that are being run on 3 different slaves (Slaves A, B, C) and the results of all three need to be packaged as one. How does the master handle this situation?
Please let me know.
Thank you
May 13, 2011
Yang Chen says:
I setup one Windows slave for one Windows master, they are two physical machines. The "Remote FS root" is set to "E:\hudson". I want to create one job named TEST to run a bat script aaa.bat on the slave. The content of aaa.bat is:
cd F:\project
call bbb.bat
The bbb.bat is in F:\project. But when I run the job, hudson try to find and run the bbb.bat in E:\hudson\TEST. The "cd F:\project" is not used. If I added "echo %cd%" after "cd F:\project" to print the current directory, it prints something like "E:\hudson\TEST". Can anybody tell me how to resolve this? I want the "cd" command takes effect. Thanks.
May 15, 2011
Yang Chen says:
I fixed the issue, just type "F:" first or put the script on the same drive as the Hudson workspace.
Jul 07, 2011
Laurent Tardif says:
I got a "funny" error starting a slave on Windows XP, the service failed immediately, without logging anything. When launching the slave from the command line, everything works fine (with same user as the service, ...).
After investigation, it appears the "event logs" were full. The service was failing with : can not add logs ... system is full ....
So, I cleaned the log, and the service works again.
TIP : running the client on the command line with the "test" parameter may help :)
Mar 21, 2012
Akilan Paulraj says:
I have my master on one server (hudson installation) and a slave installed on another server. I am unable to run a build. It is looking for CVSNT in the global installation (the path is different on the slave).
How do I set the CVSNT path for the local machine?
"java.io.IOException: Cannot run program "D:\cvsnt\cvs.exe" (in directory" - error
Kindly help me, I have been struggling for one week
Regards
Akilan