By Nakonia (Niki) Hayes

If a nationally recognized test is given, but the sampling method used to select the students who were tested cannot be clearly explained, and the names of the schools that participated are protected by laws of confidentiality, should its test results be invalidated for practical use? The answer must be "Yes."
In December 2009, the Austin, Texas, school system was applauded for ranking number one in eighth-grade math scores and number two in fourth-grade scores among 18 American urban districts. These districts had participated in the National Assessment of Educational Progress (NAEP) tests, commonly called the "nation's report card."
It was exciting news, even though it showed that only 2 out of 5 Austin students were performing "proficiently" at their grade level in mathematics. This "success" reflected the disastrous state of mathematics education in America. Austin officials cautiously praised the results in the news reports while also expressing concern about the continuing problem of improving math scores among their students.
While the NAEP has a respected reputation as a fairly solid math content assessment, scrutiny of its test results is required, as with all tests that take time, energy, and money from classroom instruction. Clearly, such scrutiny has become more vital in today's test-driven society.
For one thing, there's big money to be made in test-making. The burden must therefore be on buyers to know whether chosen assessments really measure what the buyers want to know. Does the test offer true insight into an individual's knowledge base? Or does it instead measure social attitudes and creative responses rather than command of the subject matter?
Well-known tests like the SAT and ACT reportedly let students and colleges know whether the students are academically prepared for college coursework. Numerous K-12 achievement tests also have track records, one of which is the NAEP.
Looming on the horizon is a new "national" testing program for K-12 mathematics and reading. It is to accompany President Barack Obama's proposed new federal standards in those two subjects. Called the Common Core State Standards Initiative (CCSSI), it comes with little understanding of how its new tests will affect other tests given by the states.
How such an expansive testing program can be scrutinized and used for improvement within instructional planning is unknown. Its unbelievable financial cost is also unknown. In the meantime, it's important for educators to hone their skills in assessing the processes and results of tests that have an established track record.
It turns out that for Austin's NAEP scores, scrutiny of the process is not possible. Interestingly, a caveat about the use of the test scores is already given by the NAEP on its website. They warn against putting too much stock in the scores:
"As provided by law, NCES (National Center for Education Statistics), upon review of congressionally mandated evaluations of NAEP, has determined that achievement levels are to be used on a trial basis and should be interpreted with caution."
Then, in the very next sentence, they write a rebuttal: "The NAEP achievement levels have been widely used by national and state officials."
The obvious question, since the NAEP uses a "sampling" of students, is "How are students selected for the testing sample? Or what are the criteria for their selection?"
While there were 1,500 fourth-grade students from 70 Austin elementary schools and 1,300 eighth-grade students representing 20 middle schools, the sampling could range from 45 students at one school to 20 at another, according to the district testing office.
When asked about the selection process, both the Austin district's testing representative and the NAEP state coordinator for the Texas Education Agency (TEA) explained that each school had a NAEP coordinator and that the "building" sent in student names to the state office.
Actually, the TEA first sends a list of student names for each school to the NAEP, and the NAEP selects the schools for participation in the test. The "building" (school) then compiles the names of participating students and forwards them to the NAEP central office.
Since a "building" can't send in the names, the questions remained: "Who specifically pulls together the names to be tested? Was it the NAEP coordinator for the building? Are there criteria for the students' selection?"
A new answer was that efforts are made to include students from all demographics. For example, according to the NAEP report, 44 percent of the Austin fourth graders (660) and 29 percent of the eighth graders (377) were English Language Learners (ELL) and/or students with disabilities (SD).
Five percent of the selected fourth graders and seven percent of the eighth graders were ultimately excluded from taking the test because their academic needs could not be adequately accommodated.
However, of all the remaining fourth graders, 19 percent (285) were allowed accommodations to help them with the test-taking. For the eighth grade, nine percent (117) were allowed accommodations.
Such accommodations for ELL and SD students would have had to be based on those listed in their official Individual Education Plans (IEP). Among many different accommodations, these could have included extended time, a private setting, and/or questions being read to the student.
If a student's IEP says he must be allowed to use a calculator for all of his math work, for example, then that accommodation must be permitted on the test.
This produced a new question: "Why are accommodated scores included in the final scores for the district?"
While ELL and SD students deserve the opportunity to be tested with their peers, if their tests are taken with accommodations, those scores should be calculated in a separate tally. The total results reported in press releases must clearly state that accommodated scoring is included in the figures.
Such action is not about excluding special-needs students. It's about being able to focus honestly on the academic reality and needs of two distinct groups: those who require accommodations and those who don't use them. Embedding accommodated scores is an appreciated gesture of inclusion for special students, but it is not helpful for general instructional planning by the overarching school district, which by its nature must plan largely for the non-accommodated student.
After several weeks, there was still no definitive answer from the district or state offices on how students were selected for the tests. A request was made for the names of three elementary and three middle schools that had participated in the tests.
Perhaps the NAEP coordinators of the schools could answer the question.
The district office's response said a call should be made to its legal office. Thereupon, the attorney's office replied, "We are claiming Information Confidential by Law (552.101) under Section 303(c)(3) of the NAEP Act and FERPA (Family Educational Rights and Privacy Act)."
Of course, by law, any information related to a specific student is confidential, including the school he or she attends. However, no student's name or identifying characteristics were being requested, only six locations (out of 90 schools) where the testing had been done.
This inability to respond to a simple inquiry about the NAEP methodology must raise doubts about the use of the test results, since those depend on a process that cannot, or will not, be explained.
As usual, applying data from any assessment for the potential benefit of students will fall to the teachers, but explanations of the assessment's use and validity, to teachers and the public, must come from administrators and district officials.
How can this be done without full knowledge of and trust in the processes used, understandable delineations among groups, and even user-friendly definitions of the academic levels measured on the tests?
For example, the data in Austin's case show that of the 1,300 eighth-grade students in the test, 39 percent scored at a level of "proficiency or above." The 1,500 fourth-grade Austin students taking the math test scored second only to Charlotte, NC, in that proficient category, with 38 percent. (See
http://alt.coxnewsweb.com/statesman/pdf/12/120909_reportcard.pdf.)
The proficient achievement level, according to the NAEP website, "Represents solid academic performance. Students reaching this level have demonstrated competency over challenging subject matter."
What does "challenging subject matter" mean?
Since there is also a separate "advanced" level for superior performance, the proficient level must represent average or above-average work. How far "above average" does this mean? Shouldn't examples of these categories be shown in press releases or news stories?
Those scoring at the "basic" level are considered to have "partial mastery of prerequisite knowledge and skills that are fundamental for proficient work at each grade." "Partial mastery" must mean these students cannot work consistently at their expected grade level in mathematics because they are missing some of the knowledge and skills they need.
Do the test results help teachers know what those are?
There is a fourth category of students who score "below basic."
How far "below" basic is this: one year behind grade level, two years, more? (A quarter of Austin's eighth graders were in this group.)
Looking at the total eighth-grade sample, this means 507 were proficient, 468 were basic performers, and 325 were below basic. In other words, 793 of 1,300 (61 percent) were not working successfully at their grade level.
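For readers who want to check that breakdown themselves, a minimal sketch follows. It is not official NAEP code; it simply redoes the arithmetic using the figures cited above (a 1,300-student sample, 39 percent proficient or above, roughly a quarter below basic, and the rest treated as basic).

```python
# Arithmetic check of the eighth-grade figures cited in the article.
sample = 1300
proficient_or_above = round(0.39 * sample)           # 39 percent -> 507 students
below_basic = round(0.25 * sample)                    # about a quarter -> 325 students
basic = sample - proficient_or_above - below_basic    # remainder -> 468 students

not_at_grade_level = basic + below_basic               # 793 students
share = not_at_grade_level / sample                    # 0.61, i.e., 61 percent

print(proficient_or_above, basic, below_basic, f"{share:.0%}")
```

Run as written, the sketch reproduces the 507 / 468 / 325 split and the 61 percent figure, so the eighth-grade numbers at least are internally consistent.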

The Austin newspapers also reported that while 38 percent of the fourth graders were proficient in their scores, 83 percent scored at the basic level.
Since that adds up to more than 100 percent, an accurate report of the data would be needed for good instructional use.
In conclusion, if results are based on a sampling that can't be defined or verified, and are reported as a mixture of different academic groups, how can they be used effectively by a school district?
If they can't be, what is the purpose of participating in another costly test, no matter how highly rated it has become by the education establishment?