Development/Testing

From Soar Wiki

About Testing

The intention is to provide a testing "checklist" to ensure coverage and consistency without going all the way to fully automated testing (which can be a lot of work to set up and maintain). Thus, tests will consist of steps to be done by hand, with an English description of the expected results which can reasonably be verified without resorting to doing a diff (which may be oversensitive anyway). This is not to say that no parts of the testing are automated -- indeed, many tests consist of running a test program and checking the output.

There are multiple checklists. The Quick Tests should be run by a developer after making any significant change and before checking the code in. The intention is that these take no more than 30-45 minutes to run through. The Deep Tests, on the other hand, cover a broader range of more specific functionality. The Deep Tests list is further split into subcategories, so that a developer who changes functionality specific to one category can run just that part of the Deep Tests.

The checklist has several columns:

  1. Test: Links to the steps for performing the test. Test names are prefixed with T: to prevent possible conflicts with other pages. The tests are numbered to make it easier to reference test ranges (e.g. I'll do tests 1-5, you do tests 6-10).
  2. Windows, Linux, OS X: Each lists the last Subversion revision at which the test was run on that platform and whether it passed or failed. If a test fails, the result should be strongly emphasized so it stands out.
  3. Comments: Any comment someone wants to make about the test. If a test failed, the comment might provide details. Please add your Wiki signature whether or not you make a comment.

Example:

Test | Windows | Linux | OS X | Comments
1 T:SomeTest | 123:PASSED | 102:FAILED | ?:? | Failed on Redhat Linux because can't find library; passes on Gentoo --Bob 14:59, 31 Jan 2006 (EST)

This means that T:SomeTest was last tested on Windows under revision 123 (and it passed), was last tested on Linux under revision 102 (and it failed), and was never run on OS X.
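The cell convention described above (revision, colon, result, with ?:? meaning the test was never run) is simple enough to check mechanically. The following is a minimal sketch of such a check, not part of the checklist itself; the names PlatformResult and parse_cell are hypothetical, and the sketch assumes PASSED and FAILED are the only result words used.

```python
from typing import NamedTuple, Optional


class PlatformResult(NamedTuple):
    revision: Optional[int]  # None when the test was never run
    status: Optional[str]    # "PASSED", "FAILED", or None


def parse_cell(cell: str) -> PlatformResult:
    """Parse one platform cell such as '123:PASSED' or '?:?'."""
    cell = cell.strip()
    if cell == "?:?":
        # The checklist uses ?:? for a platform the test was never run on.
        return PlatformResult(None, None)
    revision, _, status = cell.partition(":")
    if not revision.isdigit() or status not in ("PASSED", "FAILED"):
        raise ValueError(f"unrecognized checklist cell: {cell!r}")
    return PlatformResult(int(revision), status)


# The example row for T:SomeTest from the table above.
row = {"Windows": "123:PASSED", "Linux": "102:FAILED", "OS X": "?:?"}
for platform, cell in row.items():
    result = parse_cell(cell)
    if result.revision is None:
        print(f"{platform}: never run")
    else:
        print(f"{platform}: revision {result.revision}, {result.status}")
```

Cells that do not follow the convention, such as N/A, are rejected by this sketch and would need their own case.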

Unless otherwise noted, all tests are assumed to be run on Debug builds (so memory leak testing can be done). Release build tests are welcome, and should be noted in the comments when done.

Quick Tests

Test | Windows | Linux | OS X | Comments
Q1 T:TestConnectionSML | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q2 T:TestClientSML Simple | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q3 T:TestClientSML Remote | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q4 T:TestMultiAgent | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q5 T:TestJavaSML Simple | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q6 T:TestJavaSML Remote | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q7 T:TestTclSML | 6563:PASSED | 5131:PASSED | ?:? | --Bob 16:41, 8 May 2006 (EDT)
Q8 T:TestPythonSML | 6563:PASSED | ?:? | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q9 T:TestCsharpSML | 6563:PASSED | ?:? | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q10 T:SoarJavaDebugger Simple | 6563:PASSED | 5157:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q11 T:JavaTOH Simple | 6563:PASSED | 5157:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q12 T:JavaMissionaries Simple | 6563:PASSED | 5157:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q13 T:towersSML Simple | 6563:PASSED | 5131:PASSED | ?:? | --Bob 10:25, 14 Feb 2007 (EST)
Q14 T:towersSML Remote | 6563:PASSED | 5131:FAILED | ?:? | Takes too long, so listener closes socket (see bug 765). --Bob 16:41, 8 May 2006 (EDT)
Q15 T:TestSoarPerformance | 6563:PASSED | 5131:PASSED | ?:? | --Bob 16:41, 8 May 2006 (EDT)
Q16 T:TestSMLPerformance | 6563:PASSED | ?:? | ?:? | Second test is really slow on Vista. --Bob 10:25, 14 Feb 2007 (EST)

Deep Tests

Run Tests

These test the run command. To save space, some tests are actually multiple tests in one -- for example, a test might test running 0, 1 and 10 decision cycles under certain conditions. If a test fails under some subset of these "subtests" it should be noted in the comments. Furthermore, some tests can be run under single and multiple agent cases. There are separate tables for each kind of agent case.
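As an illustration of how one checklist entry expands into several subtests, the sketch below generates the commands for the 0, 1 and 10 cycle cases and builds the note to put in the Comments column when only some of them fail. It is a rough sketch only: the helper names are hypothetical, and the command syntax (run -d, -e or -p followed by a count, plus --self to run one agent) is an assumption about the Soar command line that should be checked against the steps linked from each test.

```python
# Assumed flag for each run granularity; verify against the linked test steps.
UNIT_FLAGS = {"decision": "-d", "elaboration": "-e", "phase": "-p"}


def run_commands(unit, counts=(0, 1, 10), self_only=False):
    """Return one run command per subtest count for the given unit."""
    flag = UNIT_FLAGS[unit]
    suffix = " --self" if self_only else ""
    return [f"run {flag} {count}{suffix}" for count in counts]


def failure_note(results):
    """Build a comment naming the failed subtests.

    results maps each subtest count to True (passed) or False (failed).
    Returns an empty string when every subtest passed.
    """
    failed = [str(count) for count, passed in results.items() if not passed]
    return "Failed subtests: " + ", ".join(failed) if failed else ""


for command in run_commands("decision"):
    print(command)

# Hypothetical outcome: the 10-cycle subtest failed, the others passed.
print(failure_note({0: True, 1: True, 10: False}))
```

The same command list can be reused for the single agent, multiple agent and --self tables by changing only the self_only argument.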

Single Agent

These are the tests with just one agent.

Test | Windows | Linux | OS X | Comments
R1 T:Run by Decision | 5143:PASSED | 5150:PASSED | ?:? | --Karen 13:26, 9 May 2006 (EDT)
R2 T:Run by Decision to StopPoint | 5143:PASSED | 5150:PASSED | ?:? | --Bob 16:44, 9 May 2006 (EDT)
R3 T:Run by Decision from StopPoint | 5144:PASSED | 5150:PASSED | ?:? |
R4 T:Run by Elaboration | 5144:PASSED | 5150:PASSED | ?:? |
R5 T:Run by Phase | 5144:PASSED | 5150:PASSED | ?:? |
R6 T:Run Til Output | 5144:PASSED | 5157:PASSED | ?:? |

Multiple Agents

These are the tests with at least 2 agents (but feel free to test more). All agents are run together.

Test | Windows | Linux | OS X | Comments
R1 T:Run by Decision | 5144:PASSED | 5157:PASSED | ?:? | --Bob 14:49, 10 May 2006 (EDT)
R2 T:Run by Decision to StopPoint | 5144:PASSED | 5157:PASSED | ?:? |
R3 T:Run by Decision from StopPoint | 5144:PASSED | 5157:PASSED | ?:? |
R4 T:Run by Elaboration | 5144:PASSED | 5157:PASSED | ?:? |
R5 T:Run by Phase | 5144:PASSED | 5157:PASSED | ?:? |
R6 T:Run Til Output | 5144:PASSED | 5157:PASSED | ?:? | The first agent to exceed the stack depth limit will generate an interrupt, so not all agents will get to the same point. --Karen 14:31, 9 May 2006 (EDT)

Multiple Agents, Run --self

These are the tests with at least 2 agents (but feel free to test more). Only one agent is run (feel free to vary which one).

Test | Windows | Linux | OS X | Comments
R1 T:Run by Decision | 5144:PASSED | 5157:PASSED | ?:? |
R2 T:Run by Decision to StopPoint | 5144:PASSED | 5157:PASSED | ?:? |
R3 T:Run by Decision from StopPoint | 5144:PASSED | 5157:PASSED | ?:? |
R4 T:Run by Elaboration | 5144:PASSED | 5157:PASSED | ?:? |
R5 T:Run by Phase | 5144:PASSED | 5157:PASSED | ?:? |
R6 T:Run Til Output | 5144:PASSED | 5157:PASSED | ?:? |

Miscellaneous Tests

Test | Windows | Linux | OS X | Comments
M1: T:Java5Run | ?:? | ?:? | ?:? |
M2: T:Java5Build | ?:? | ?:? | ?:? |
M3: T:VS2005Build | ?:? | N/A | N/A |
M4: T:StaticBuild | 5155:PASSED | ?:? | ?:? | --Bob 09:01, 10 May 2006 (EDT)

Regression Tests

These test old bugs that have been fixed.

Bug Tests

These test existing bugs that have not yet been fixed.

Test | Windows | Linux | OS X | Comments
Bug 606 T:Bug606 | 3400:FAILED | ?:? | ?:? | Output-link command is marked status complete too soon. See bug 606 for details. --Bob 15:17, 31 Jan 2006 (EST)