Bill Peck | 8 Jun 2009 18:14

Re: The Scheduler piece

Marian Csontos wrote:
> Hi Bill, it's nice and simple.
>
> Just a few thoughts:
>
> Multihost tests that use a rare resource can block several systems in the
> Assigned state.
> The same goes for an engineer reserving a particular known system - we
> should see engineers as a form of rare resource as well...
>
> Both can be solved simply by dynamically calculating a RecipeSet's
> priority as a function of the job's priority and "something else":
>
> Suggested solution #1: More specific jobs (e.g. where particular HW is 
> requested - x86_64 with > 4GB RAM and ... - where the constraints result 
> in a single system or just 2-N) will be prioritised over general jobs 
> (e.g. any x86_64) - add the size of the set of available systems as an 
> argument to the priority function.
>
> Suggested solution #2: RecipeSet with assigned systems will have higher 
> priority - add number of blocking systems as argument to priority function.
>   
> Suggested solution #3: Assign rarer resources first and more common ones 
> only after the lock on the rarer ones has been obtained.
> This would require changes in the scheduler. Combine with #2 for better 
> results.
>   

I like number 1 and 2.  Number 3 sounds interesting but I'm not sure how 
much work it would be to determine what a rare resource is.
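
To make 1 and 2 a bit more concrete, here is a rough sketch of how the two 
could be folded into one priority function.  The function name, the two 
weights and the formula itself are made up for illustration; nothing like 
this exists in beaker yet.

```python
def recipeset_priority(job_priority, candidate_systems, assigned_systems,
                       rarity_weight=10.0, blocking_weight=5.0):
    """Hypothetical RecipeSet priority; a higher value is scheduled first.

    job_priority      -- base priority taken from the job
    candidate_systems -- how many systems could run the RecipeSet (idea #1)
    assigned_systems  -- how many systems the RecipeSet already holds (idea #2)
    """
    if candidate_systems < 1:
        raise ValueError("a schedulable RecipeSet needs at least one candidate")
    # Idea #1: fewer candidates means a more specific request, so boost it.
    rarity_bonus = rarity_weight / candidate_systems
    # Idea #2: every system already held is blocked for others, so boost
    # RecipeSets that are partly assigned to get them finished sooner.
    blocking_bonus = blocking_weight * assigned_systems
    return job_priority + rarity_bonus + blocking_bonus
```

With these made-up weights a job that matches a single system outranks an 
"any x86_64" job of the same base priority, and a RecipeSet that already 
holds two systems outranks one that holds none.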

> -- Marian
>
> Bill Peck wrote:
>   
>> Hello Everyone,
>>
>> There is plenty more work to do on the inventory piece of beaker but I 
>> wanted to share some of my thoughts on the scheduler to let everyone 
>> know how I think things will work. 
>>
>> First a rough idea on how all these pieces will fit together:
>>
>> Inventory Piece
>> -------------------------
>> - Contains systems!
>> - Keeps track of what systems can install or more precisely what things a 
>> system can't install (ie: this box doesn't install Fedora8 but anything 
>> newer is good)
>> - Keeps track of what options are needed for installing said family 
>> (Fedora 9 on this box needs noapic or nousbstorage)
>> - Keeps track of who has access to what.  (This system is shared but 
>> only to other people in the desktop group)
>> - Keeps track of who is currently using a system.
>> - Keeps a log of what actions were performed (power cycled, 
>> provisioned with Fedora-rawhide-20090601)
>> - Keeps a log of what config values were changed on the system (memory 
>> was increased from 4096 to 8000)
>>
>> Scheduler Piece
>> -------------------------
>> - Contains jobs!
>> - Jobs are just a container of related recipeSets.
>> - RecipeSets hold a collection of recipes that need to run at the same 
>> time (multi-host)
>> - Recipes have a collection of tests that you want to run along with the 
>> following:
>>    - distroRequires: Requirements that allow the scheduler to pick a 
>> distro (I want the latest i386 Fedora-rawhide)
>>    - hostRequires: Requirements that allow the scheduler to pick a 
>> system (This is first filtered by the chosen distro, so we only get 
>> systems that are capable of installing that distro.  Second, we could 
>> have requirements like Processor = Intel, memory greater than 4 gig, etc.)
>> - Tests are the actual test to run plus test parameters that you passed in.
>>
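The containment described in the Scheduler Piece could be pictured roughly 
like this.  It is only a sketch: the class and field names (RecipeTest, 
distro_requires, host_requires, recipe_count) are picked for illustration 
and are not the real beaker model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RecipeTest:
    """One test to run plus the parameters passed in (hypothetical name)."""
    name: str
    params: Dict[str, str] = field(default_factory=dict)


@dataclass
class Recipe:
    """Tests for one system, plus what the scheduler needs to pick it."""
    distro_requires: str
    host_requires: str
    tests: List[RecipeTest] = field(default_factory=list)


@dataclass
class RecipeSet:
    """Recipes that have to run at the same time (multi-host)."""
    recipes: List[Recipe] = field(default_factory=list)


@dataclass
class Job:
    """Just a container of related RecipeSets."""
    owner: str
    recipesets: List[RecipeSet] = field(default_factory=list)

    def recipe_count(self):
        # Total number of systems the job needs across all RecipeSets.
        return sum(len(rs.recipes) for rs in self.recipesets)
```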
>> Lab Controller Piece
>> ---------------------------
>> - This is used only by Inventory.
>> - Cobbler is the heart of this piece
>> - Usually you have one of these per physical location (PXE tends to work 
>> best locally)
>> - Cobbler imports the distros and we tell Inventory about what we have.
>>
>> The Harness
>> ---------------------------
>> - This piece is still under design.
>> - Whatever this is it will be in charge of running the chosen tests on 
>> the chosen system.
>> - The results should go back to the scheduler.  But they may also go to 
>> a Test Case Management system.
>>
>>
>>
>> Ok,  Now that the basics are out of the way I'm going to brain dump how 
>> I think the scheduler should work.  The current scheduler we use does 
>> not scale well at all.  The design I'm planning to implement will only 
>> filter recipe requirements once and then loop on queued recipes and free 
>> systems.  If no systems are free then we won't see any of the queued 
>> recipes.  It should scale very well.   There is quite a bit of 
>> python/sqlalchemy code below.  It's mostly pseudo code, but the ideas 
>> should be sound.
>>
>> States
>> --------
>> New            <- Recipes start in this state
>> Queued         <- After initial filtering happens (what systems match)
>> Assigned       <- The Recipe has been assigned a machine
>> Running        <- The Recipe is actually running on the system
>> Completed      <- We're done.
>> InComplete     <- We're done but didn't finish.
>>
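Read together with the threads described further down, the states seem to 
allow the moves below.  This table is my reading, not a spec: the 
Assigned -> Queued edge comes from the "system was taken from underneath 
us" case, and aborts are left out because the list above does not say 
which state an aborted recipe ends up in.

```python
# Hypothetical transition table derived from the state list and the threads.
ALLOWED = {
    "New": {"Queued"},                         # New_process
    "Queued": {"Assigned"},                    # Queued_process
    "Assigned": {"Queued", "Running"},         # put back, or actually started
    "Running": {"Completed", "InComplete"},
    "Completed": set(),
    "InComplete": set(),
}


def can_move(current, target):
    """Return True if a recipe may go from state `current` to `target`."""
    return target in ALLOWED[current]
```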
>> The big change and the piece that will make this all work is I'm going 
>> to use a cache table.  It's really a mapping table between systems and 
>> recipes.
>>
>> queue_cache Table
>>    system_id, recipe_id
>>
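As a small illustration of why the mapping table helps, here is a sketch 
using sqlite3 instead of the real models.  The recipe and system tables 
are cut down to the bare minimum, and build_demo_db and 
queued_with_free_systems are made-up helper names; only the queue_cache 
columns come from the description above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a stripped-down schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE system (id INTEGER PRIMARY KEY, user_id INTEGER);
        CREATE TABLE recipe (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE queue_cache (
            system_id INTEGER NOT NULL REFERENCES system(id),
            recipe_id INTEGER NOT NULL REFERENCES recipe(id),
            PRIMARY KEY (system_id, recipe_id)
        );
    """)
    return db


def queued_with_free_systems(db):
    """Ids of Queued recipes that have at least one unreserved candidate."""
    rows = db.execute("""
        SELECT DISTINCT recipe.id
        FROM recipe
        JOIN queue_cache ON queue_cache.recipe_id = recipe.id
        JOIN system ON system.id = queue_cache.system_id
        WHERE recipe.status = 'Queued' AND system.user_id IS NULL
        ORDER BY recipe.id
    """)
    return [row[0] for row in rows]
```

The requirement filtering is paid for once, when the rows go into 
queue_cache.  After that the scheduler only runs a join like this one, 
and it returns nothing at all when every candidate system is busy.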
>> Recipes go in as New
>>
>> We'll have the following 4 threads on the Server
>>
>> New_process:
>>  for recipe in Recipe.query().filter(Recipe.status==TestStatus.by_name(u'New')):
>>      # Figure out the distro requested.
>>      distro = Distro.process_requires(requires)
>>      if not distro:
>>          # No distro matches so abort the whole recipeSet.
>>          recipe.recipeset.action_abort('No distro matches for recipe %s' % recipe.id)
>>          # Move on to the next New recipe instead of leaving the loop.
>>          continue
>>      # Filter systems based on selected distro + recipe requirements.
>>      for system in distro.systems.process_requires(requires):
>>          # Don't add the same host twice to the same recipeSet.
>>          # A machine can't be in two places at once.
>>          for peer_recipe in recipe.recipeset.recipes:
>>              if system in peer_recipe.systems:
>>                  break
>>          else:
>>              # populate queue_cache table
>>              recipe.possible_systems.append(system)
>>      if recipe.possible_systems:
>>          # There should only ever be one thread/process moving recipes from
>>          # New to Queued.
>>          recipe.status = TestStatus.by_name(u'Queued')
>>      else:
>>          # Can't schedule, abort the whole recipeSet.
>>          recipe.recipeset.action_abort('No systems match for recipe %s' % recipe.id)
>>     
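The for/else in New_process is easy to misread, so here is the same 
filtering step on plain lists.  The function name and its arguments are 
placeholders, not beaker code.

```python
def filter_candidates(candidates, peer_systems):
    """Drop candidates that a peer recipe in the same recipeSet already has.

    candidates   -- systems matching this recipe's requirements
    peer_systems -- one list of systems per peer recipe in the recipeSet
    """
    possible = []
    for system in candidates:
        for systems in peer_systems:
            if system in systems:
                # Already claimed by a peer; skip this candidate.
                break
        else:
            # The inner loop finished without a break: no peer has it.
            possible.append(system)
    return possible
```

The else branch belongs to the inner for loop, so it only runs when no 
peer recipe matched the system.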
>> Queued_process:
>>   # Get a list of all recipes that are queued and have systems that are free.
>>   for recipe in Recipe.query().join('status').join('possible_systems').filter(and_(Recipe.status==TestStatus.by_name(u'Queued'),System.user==None)):
>>      #Pick the first free system
>>      system = recipe.free_systems.first()
>>      if system:
>>          # Atomic operation to put recipe in Assigned state
>>          if session.connection(Recipe).execute(recipe_table.update(
>>                and_(recipe_table.c.id==recipe.id,
>>                     recipe_table.c.status_id==TestStatus.by_name(u'Queued').id),
>>                values=dict(status_id=TestStatus.by_name(u'Assigned').id))).rowcount == 1:
>>              # Atomic operation to reserve the system
>>              if session.connection(System).execute(system_table.update(
>>                  and_(system_table.c.id==system.id,
>>                       system_table.c.user_id==None),
>>                  values=dict(user_id=recipe.recipeset.job.owner.user_id))).rowcount != 1:
>>                  # The system was taken from underneath us.  Put recipe
>>                  # back into queued state and try again.
>>                  recipe.status = TestStatus.by_name(u'Queued')
>>          else:
>>              pass
>>              #Some other thread beat us. skip this recipe now.
>>              # Depending on scheduler load it should be safe to run multiple
>>              # Queued processes..  Also, systems that we don't directly
>>              # control, for example, systems at a remote location that can
>>              # pull jobs but not have any pushed onto them.  These systems
>>              # could take a recipe and put it in running state. Not sure how
>>              # to deal with multi-host jobs at remote locations.  May need to
>>              # enforce single recipes for remote execution.
>>
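The two "atomic operation" steps in Queued_process boil down to a 
conditional UPDATE followed by a rowcount check.  Below is a sketch of 
that pattern with sqlite3 and plain SQL.  The table layout and the 
try_assign name are assumptions for the example, and the status is kept 
as a string rather than a status_id to keep it short.

```python
import sqlite3


def try_assign(db, recipe_id, system_id, user_id):
    """Claim a recipe and a system; return True only if both succeeded."""
    # Only flips the recipe if it is still Queued, so two workers cannot
    # both win: the loser sees rowcount == 0.
    cur = db.execute(
        "UPDATE recipe SET status = 'Assigned' "
        "WHERE id = ? AND status = 'Queued'",
        (recipe_id,))
    if cur.rowcount != 1:
        return False  # some other worker beat us to the recipe
    # Same trick for the system: reserve it only if nobody holds it.
    cur = db.execute(
        "UPDATE system SET user_id = ? WHERE id = ? AND user_id IS NULL",
        (user_id, system_id))
    if cur.rowcount != 1:
        # The system was taken from underneath us; requeue the recipe.
        db.execute("UPDATE recipe SET status = 'Queued' WHERE id = ?",
                   (recipe_id,))
        return False
    return True
```

Because the WHERE clause carries the expected old value, no explicit 
lock is needed around the check and the update.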
>> Assigned_process:
>>    for recipeSet in RecipeSet.query().filter(Recipe.status==TestStatus.by_name(u'Assigned')):
>>       # All recipes in a recipeSet will be in Assigned state.
>>       # Figure out every recipe's role in the recipeSet
>>       recipeSet.schedule()
>>       # Clear recipe.systems = []
>>
>> Abort_process:
>>     # A recipe could initially have systems that match but if those systems
>>     # are removed or marked broken then a recipe could stay in the queued
>>     # state forever.  This will clear them out.  We could add some date
>>     # criteria if we don't want recipes to abort right away.  Maybe a
>>     # replacement system will come back online shortly?  At least the user
>>     # will be able to see that no systems are listed and understand why
>>     # their recipe is not running.
>>     for recipe in Recipe.query().filter(Recipe.status==TestStatus.by_name(u'Queued')):
>>         if not recipe.systems:
>>             recipe.recipeset.action_abort('No systems match for recipe %s' % recipe.id)
>>
>> Adding a new host:
>>    for recipe in Recipe.query().filter(Recipe.status==TestStatus.by_name(u'Queued')):
>>        # Do I match for added system?
>>        # Was I already added to recipeSet for this recipe? If not continue
>>            recipe.possible_systems.append(system)
>>
>> Removing a host:
>>     # Clear any potential recipes from this system.
>>     system.recipes = []
>>
>> Changing any of the following would force a remove/add on a host:
>>    Change owner.
>>    Loan the system to someone.
>>    Change the groups.
>>    Change to Key/Values or to details, basically any change that would 
>> affect scheduling.
>>
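The add/remove handling and the "any change forces a remove/add" rule 
could look something like this on an in-memory stand-in for queue_cache.  
The dict layout, the function names and the matches callback are all 
invented for the sketch; in the real thing matches would be the 
requirement filtering from New_process.

```python
def remove_host(cache, system_id):
    """Clear the system from every recipe's candidate set."""
    for systems in cache.values():
        systems.discard(system_id)


def add_host(cache, system_id, matches):
    """Offer the system to every queued recipe whose requirements it meets.

    cache   -- dict mapping recipe id -> set of candidate system ids
    matches -- callable(recipe_id, system_id) -> bool (hypothetical hook)
    """
    for recipe_id, systems in cache.items():
        if matches(recipe_id, system_id):
            systems.add(system_id)


def host_changed(cache, system_id, matches):
    """Owner, loan, group or key/value change: re-evaluate from scratch."""
    remove_host(cache, system_id)
    add_host(cache, system_id, matches)
```

Treating every change as a remove followed by an add keeps the cache 
correct without having to work out which recipes a given change affects.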
>> Any thoughts? Suggestions?  I'll be branching current beaker into a 0.4 
>> branch so that I can make bug fix releases if need be.
>>
>> _______________________________________________
>> Beaker-devel mailing list
>> Beaker-devel <at> lists.fedorahosted.org
>> https://fedorahosted.org/mailman/listinfo/beaker-devel
>>   
>>     

