“Just go ahead and state your name.”

“My name is Carolyn Huie Hofstetter.  Should I spell that?”

“Spell Hofstetter.”

“Hofstetter H O F as in Frank S as in Sam T E T T E R and I am an Assistant Professor in the Graduate School of Education at the University of California Berkeley.”

“Great.  Alright.  What is evaluation?”

“Well uh it has (pause) it depends on who you ask but in general evaluation is the collection of s..systematic data collection.  I’m sorry.  Can I restate that?  Uh in general evaluation is the collection of data to make some sort of um judgment about the merit or worth of something and it usually refers to uh judgments about programs, policies, personnel, projects or curricula something like that.  That’s the very general definition of evaluation but it has very broad um has a very broad range depending upon who you ask.”

“What is assessment?”

“Um well assessment as I would define it is uh also systematic data collection to I think measure uh (pause) probably characteristics of individual probably focused more on like psychological traits or behaviors um like aptitude or achievement.  Uh more affective type behaviors like uh self-esteem.  (Pause)  That sort of thing.  It focuses more on individuals I believe than en..than evaluation does.”

“K.  What would you like educators to understand about evaluation processes in contrast to assessment processes?”

“Hmm.  That’s a very broad question.  (Deep breath)  Well (pause) evaluation is a broader term I think or has broader use than assessment.  Uh it can include assessments where generally evalu..assessments do not include evaluations.  I think there’s a hierarchical order here.  Um when and it just depends on you know what sorts of information you want to collect.   You know what’s the purpose of the evaluation what’s the purpose of the assessment and they have very different goals uh depending upon what you are looking for.  So for evaluation you might be looking at what is the effectiveness of a particular program or intervention for increasing the achievement of children and you might use an assessment tool or several assessment tools to actually uh provide measures of the level of achievement for the kids.  So I think there’s you know it’s evaluation uh probably uses assessments but it’s not the other way around.”

“K.  What contextual factors affect math performance?”

“(Laughs)  Um well my research uh tends to focus more on English language learners and the assessment of them uh and so I focus more on contextual factors that influence uh math performance among that population.   So there are several factors that will influence their performance.  Um the obvious one is socioeconomic status which affects any sort of test uh performance.  There’s also uh English language proficiency is a very big one for this population.  Typically students are given a content-based test math or science in the English language and obviously the student’s language proficiency is a major confound for their test performance.  You may not be assessing their math performance you’re really assessing their English language proficiency.  Uh other variables or other uh uh factors that might influence their performance are their language of instruction.  Typically you don’t want to administer a test to a student that is that is in a language different from what they’re instructed in.  So it’s really not appropriate to administer uh an a test in English to a student that is instructed in Spanish for example.  Although frequently policymakers and I think people in general feel that it..administering a Spanish language or native language test to an English language learner is appropriate to get a more valid indication of the students’ abilities when in fact that may not be true.  It really depends a lot on the language of instruction and other background variables.  Um other things to consider are uh the number of years that the student has been in the United States.  Uh what type of what how much instruction the student has had in the United States in a particular language uh be it in English or some other language.  Uh sometimes people like to consider the students’ home language and how often you know uh how often I guess their native language is used in various contexts like the home and with friends and various social contexts outside of school.  
I think those are the main ones (pause) but uh several several uh factors can influence the test performance of the kids and in fact those focus only on the student themselves.  I mean there are others that will that are more test related you know like the uh the amount of uh English language or the words the number of words in a particular test item.  Uh if you’re measuring if you’re interested in measuring a student’s math ability but you have for example a math problem that is embedded within a lot of words which is very typical uh in you know s..standardized assessments now particularly you know selected or I’m sorry constructed response uh items then you know there’s there’s can be such a huge language load that you’re really starting to confound the students’ language proficiency with their math performance.”

“What accommodations should teachers make to ensure an equitable teaching situation for English language learners?”

“That is the (laughs) that is the question we all want to find the answer to and it’s uh right now I hate to say it but there are no definitive research studies.  Um the accommodations research is in its infancy.  Uh and it’s also probably one of the messiest areas of research around.  Uh I was actually in a session at NCME the National Council on Measurement in Education a conference that’s being held right now and uh the discussant who is a very well known figure in the testing field stated that this was probably the most difficult area of research that he has encountered.  And uh having been in this area for about three years I no actually longer than that probably about five years um I would have to agree.  The reason that the level of research is so limited right now uh is because it’s very hard to collect data with this particular population.  There are just lots of confounding influences that will just make the information that you obtain questionable.  So uh I don’t know how much background information to provide about accommodations if that’s (pause) of interest to you or if I should just…”

“No, go ahead.”

“Ok um well the reason that we have this notion of accommodations is because of a legislative mandate.  Uh it’s it was the reauthorization actually it was several standards based on acountabality..accountability based reforms that were passed in the nineteen nineties uh that focused on the inclusion of all children in various educational opportunities to increase um their uh ab..opportunity to achieve high standards uh to be represented in every educational opportunity and to be in..included basically.  (Pause)  In order for this population the English language learners to be included which had historically not been the case this notion of accommodations came about in the policy circles and the accommodations generally referred to a modification to the uh test itself or to the test administration process that is provided in order to yield a more valid indication of what the students know and can do especially in a particular content area.  And there are various types of accommodations.  Uh extra time is most commonly used.  Uh there are several others like uh uh administering the test in small groups or uh individually.  There are others that are more oriented towards changing the test itself like administering the test in a native language students’ native language or changing the linguistic difficulty of some of the words in the item.  Not the technical words but sort of the non-content related words that might be difficult for an English language learner to understand.  
And by introducing these different types of accommodations there are concerns that they may change the construct of the test or the content of the test or the difficulty of the test and the reason that we have accommodations is to in some way level the playing field so to speak uh for English language learners with more English with a with English fluent students uh to effectively take into consideration their level of English proficiency and accommodate that in such a way that we can obtain you know more valid scores for their kids for their uh for those kids.  Um well the research right now (pause) uh is really in its infancy.  We are trying to look at different types of accommodations but we’re finding that uh it’s a really difficult population to work with.  There’s a reason why there’s not a lot of research on just the assessment of English language learners to begin with.  Uh in the past people have just discounted standardized test results for this population because of all of the confounding influences.  And so to actually collect data with this population is problematic because we’re trying to figure out how to administer um test items in standard formats as well as accommodated formats and you administer it you know to a single student uh one form and then the other or do you randomly assign for example students within a classroom to one of the (pause) two different uh to the accommodated and standard versions or do you assign whole classes of English language learners to that accommodation or another?  Um and because the type of information that we have about this population is is pretty messy uh we’re having some difficulties.  The reason that students the English language learners would be able to participate in this study or I’m sorry.  Let me repeat that.  In order for English language learners to receive an accommodation they have to be designated as what’s called Limited English Proficient or LEP.  
And this is a federal designation that’s used in you know uh for for federal funding.  Uh however even though students are designated as LEP there are no common operational guidelines for designating students as Limited English Proficient.  So it’s completely conceivable for schools and districts to be using very different criteria for uh giving students this categorization.  And it’s that categorization that is prerequisite to them actually receiving accommodations.  So you’ve got that difference right there and it’s it’s uh it’s difficult uh in terms of the accommodations themselves.  Um teachers like to use extra time.  They feel that uh English learners really need the extra time to decode the words in a test.  Um although the research to date shows that ex..extra time really helps everybody.  Uh and it helps some students more than others.  So the students that uh tend to have higher ability levels are more likely to benefit from having extra time than students with low ability levels for example.  So if you give a student that doesn’t know the content of the test you could give them you know three days and he’s still not going to do any better.  Um and that also introduces some equity issues too that you know how do you decide whether this student gets an accommodation and that student doesn’t.  This one gets one and that one doesn’t.  Um is it fair to give a student designated as Limited English Proficient an accommodation and not an English fluent student?  You know because obviously the purpose of the tests that or depending upon the purpose of the test um you would want to find out what the kids know and can do in general and you would provide whatever conditions are necessary to make that determination.  So that’s part of the reason why this is so messy and there are other types of accommodations that we’ve looked at.  Um there’s actually a (pause) growing group of researchers around the country that are trying to look uh uh at issues in this area and the general consensus is that we don’t know what accommodations work best for English language learners.  Um there are some that look promising.  The modification of linguistically complex words in a test item uh has potential because so far that seems to be the one test accommodation that narrows the gap between Engl..between English language learners versus non-English language learners.  Um (pause) other types of accommodations are um like dictionaries.  Sometimes you would want to administer a dictionary because they’re already available in the classroom and kids are generally familiar with using them.  Uh so that’s an advantage.  
But the dictionaries could vary depending upon which one you’re using.  They might provide certain types of information that are actually part of the test item that you don’t want the children to have access to.  Um so there are various issues to consider.  I mean cer..certainly whenever you administer an accommodation you want to make sure that the kids already have some familiarity with it.  Um that they are already used to having a test in Spanish for example and if they’re already in..being instructed in English and you give them a test in Spanish they’ll be very confused.  Um if you give them a dictionary and they’ve never used a dictionary they’ll be confused.  Uh so there’s an opportunity to learn issue that’s included.  So because there are so many different um conditions you know classroom conditions for these students uh they all have access to different types of information.  They’re all at varied levels of English language proficiency.  Um some may be designated as Limited English Proficient uh in one school but not in another school so it depends on you know where you happen to be in school at.  Uh how do you decide whether a student should get an accommodation to begin with and once you decide that a student is eligible for one then what type of accommodation should it be?  So these are the sorts of issues that we’re trying to unpack but we don’t really have any definitive information to share (pause) at this point.  I can go into detail about some of the smaller studies that are available but uh basically everybody will tell you that they’re yielding mixed results and we don’t know.  (Laughs)  The problem though is that school districts and states unfortunately have to include English learners in their large-scale assessments by mandate.  I mean their policy for their funding depends on it.  And so there’s you know legislative mandates.  
There are also civil rights acts that require the students to be able to participate otherwise it’s considered discriminatory for them not to be included.  But how to actually do that in a fair and equitable way is um unclear at this point.”

“Ok.  That’s great.  Um what is the relationship between literacy knowledge and power (Ms. Hofstetter laughs) in mainstream culture in the United States?”

“Hmm.  Another very good question.  (Laughs)  Um well there seems to be a strong relationship between literacy knowledge and power.  Um and this was based this determination which seems very logical uh was (pause) basically the focus of a study that I conducted with uh two other colleagues Tom Sticht and Richard Hofstetter and we wanted to test the maxim (pause) of knowledge is power and uh and we found that you know literacy is generally measured by the types of literacy practices that people will engage in you know how much they read books magazines um newspapers.  And that helps increase the amount of knowledge that they have in their heads you know what goes into their long term memory and is permanent.  And the long term memory and its information that’s in there helps to uh helps people you know sort of filter through new information that’s coming about so that in a sense their knowledge base is increasing based on you know just a larger um (pause) just more information to begin with and typically more efficient information processing skills.  (Pause)  So the expectation is that if you read more then you probably will have more knowledge and it constantly builds on itself.  And people who have more knowledge tend to access and are able to pursue more uh I guess cultural and political opportunities than people who are less literate less knowledgeable.  And we found that this was true even after we controlled for uh certain other potentially confounding factors like level of education (pause) and age and gender.  And it seems very commonsensical it’s it’s it’s uh it’s a notion that we would all uh believe is true but I guess our study you know actually tested out the hypothesis and actually had measures of literacy and measures of knowledge and indicators of power as it was defined in various uh I guess political science circles.  
We we really looked at power within a political context you know in terms of uh whether people feel efficacious about their um behaviors you know whether they feel that their vote makes a difference in the world.  Um and people who are more literate tend to feel that you know more comfortable engaging in those sorts of discussions.  They’ll talk about political issues with their friends.  They will take um hard stances give firm opinions on you know some particular topic of interest.  Uh they may be more likely to participate in you know political events (pause) protests that sort of thing.  Um so I think you know there there seems to be a very strong relation between the three areas and uh I guess the general moral of the story is that you want to read as much as you can to ultimately you know increase your status in life (pause) and reading will never hurt you.  (Laughs)”

“What responsibility do teachers have for teaching their language minority students the culture of the United States?”

“(Deep breath) I don’t know if I have an answer to that question.  Um (pause) ya there were two questions there that uh I honestly had a little bit of difficulty with because they’re less related to testing and I’m sure you could find other people that are (laughs while saying rest of sentence) much more informed and able to give you a a halfway cogent answer.  So I is it ok if I pass on that one?”

“You bet.”


“What was the other question?”

“I think it was the one following that.”

“The one after about what’s the relationship between a person’s reading ability and their content knowledge?”

“(Pause) Well actually I I I I will try and answer that one.  What?  The relationship between…”

“A person’s reading ability…”

“Person’s reading ability…”

“…and their content knowledge.”

“…and their content knowledge.  Um well at least coming from a measurement standpoint because I’m interested in students’ test performance um there’s a strong relationship.  Students who have higher reading abilities can obviously understand uh more of what’s going on in a classroom and on a test than kids who don’t by virtue of just (pause) the amount of reading that we encounter in everyday life.  So at least in a testing context um for example you may have word prob..or math uh content and science content that’s embedded in words and the better that you can read those words and understand them comprehend them then the better off you are in terms of actually being able to understand the the question that’s being asked of you and um and be able to answer it.  So I think that emphasizes again the importance of somebody’s you know uh language proficiency (pause) and the influence it has on their I think performance in general but particularly in tests.  And since we’re living in a standards-based accountability-based world right now that’s so focused on testing that seems to be that’s a big issue.”

“Ok.  What implications does this have for literacy development of ESL students?”

“(Deep breath) Wow.  Um well we have to figure out what is the best way to help ESL students English as a Second Language students or English language learners.  We have to find ways that will best assist their development of reading and language um listening speaking skills et cetera and I think that leads us to the question of what are the best instructional practices in order to do that?  And um I think that is where we can go into an issue of for example bilingual education.  Now because there is a lot of research that’s fairly mixed in that area and I’m actually conducting an evaluation of a bilingual education program to see how well it increases a student’s English language proficiency as well as their performance in different content areas like math and language arts as compared to students that are receiving instruction completely in English.  So like an English immersion program.  And um it seems to be it’s a big issue right now partic..ularly in California uh because we don’t have (pause) well actually we have legislation now that uh prohibits bilingual education programs uh unless parents sign a waiver for their children to participate in the programs.  But it’s really a political uh politically difficult issue to deal with right now at least in our state where we have a very large number of English language learners about one in four students in our state.  I think forty percent of the English language learner population in the whole country is in California.  So a lot of stakes.  Um but you know it depending on who you ask you know what are the best ways for a child to learn you know the language?  Some will say they should achieve mastery in their native language in order for them to uh apply that knowledge those experiences to uh learning a new language.  Um other people will say you just need to immerse them in context and let them figure it out on their own.  And some of them are going to be ok.  
I mean many people (pause) are immersed in another culture and they manage to survive and they come out.  It’s very difficult at first but they they come out alright.  And then there are others that don’t.  And I think those are the ones that we’re most concerned with.  Uh so at this point it’s really hard to tell what is the best way to instruct the students.”

“What concerns do you have regarding the validity of assessments for English language learners?”

“(Pause) Um I have several concerns about the validity of assessments for English language learners.  Uh it depends on what sorts of assessments you are talking about but particularly um the standardized uh outcome based assessments that have uh potentially high stakes decisions attached to them are very problematic for the general student population but I think especially for English language learners because there’s just so much um so much concern about what the tests are actually measuring.  Um (pause) um (pause) could you read the question to me again please?”

“Sure.  What concerns do you have regarding the validity of assessments for English language learners and what can improve if any?”

“K.  (Pause) Well the concerns right now are that um the (pause) we’re not sure what the standardized tests are measuring for English language learners if they are ad..administered in English.  The research that’s been done in this area is fairly limited uh because the population has so many other factors um that characterize it that can confound the results.  You know it’s a very mobile population.  There are a lot of uh um I think familial and social factors that will influence how fluent a child is.  Um (pause) and if you know if the student is not fluent in the English language then you don’t know exactly what’s being measured when you give that student a test in math.  Uh you may not be getting an accurate measure of that student’s math ability however math is defined but at least you know in this day and age it’s defined uh and measured largely in terms of word problems.  So I think those sorts of uh questions will (pause) or those sorts of issues will question the validity of the assessments for this population.  Um and I think another concern is the purposes that assessments are developed for.  There are assessments that are developed for one purpose and they’re used for other purposes and that’s really a problem because that’s not a problem with the test that’s a problem with the use of the test.  And I think people find that hard to or th..they don’t quite realize that distinction.  So you may have a perfectly good test that is designed to give um a general indicator of a student’s knowledge in a particular language at one point in time (pause) which is frequently the case for large scale assessments um standardized assessments and try to use that information to uh as a measure of what the student actually knows and can do in the classroom and typically it’s not possible for one test measure to give you an accurate portrayal of what a child knows and can do.  But that’s sometimes the level of value that’s placed on these tests and that that is a problem.  
Uh so I think one of the  biggest concerns is that we just need to remember the purpose that the test was designed for and to make sure that that’s aligned with the use of the test.  Um another concern about the validity of standardized assessments at least is that they are typically designed and normed with a more homogenous uh traditional population which is typically more affluent uh more Caucasian um than you know this very diverse heterogeneous uh culturally and linguistically varied population and because this group has typically not been con..not been included in the norming samples of the standardized assessments then you don’t really know um what if if it’s really appropriate to compare to make judgments about this population based on a test when they weren’t included in the norming sample to begin with.  So I think that that’s something that is slowly changing because now the testing companies realize by mandate uh because of this legislation that just passed you know a few years ago everybody is realizing the importance of including English language learners in all facets of the educational system and especially so in the testing (pause) and assessment systems because of all of the stakes that are being attached to these test scores.  So now the testing companies I’m sure are including you know higher numbers of the students in their in their uh test development uh procedures.  I’m sure that they are thinking about uh how test items are worded so that they are not more linguistically complex than they need to be.  Uh I think it would be good for all test items (laughs) to be actually be as clear as possible and as simple as possible.  Um at least in terms of the language (pause) that would just be a good idea for testing in general.  Um so there’s that and then there’s one other thing I wanted to mention too.  
And uh oh well there are different types of assessments I mean obviously and I’ve been focusing more on the uh outcome based standardized assessments.  But there are also new developments in what we would call formative assessments which are actually looking at how a student performs over a long period of time.  It’s multiple measures of (pause) of a child’s knowledge uh within a classroom.  The assessments are you know they could be writing samples.  They could be um they could be you know regular standardized tests.  Uh they could be well various lots of different options but effectively you’re you’re building some sort of portfolio some sort of uh uh (pause) set of multiple pieces of information to make some judgment a more informed holistic judgment about what the kids know.  And it that is really um at least the notion of using multiple measures is something that’s very common in the testing and measurement field.  We know that you cannot make a (pause) really um definitive judgment about a student based on one test score.  But unfortunately it’s a lot of people are trying to do that and the mantra is that you really need to have multiple pieces of information in order to make any sort of decision particularly a high stakes decision.  And that sort of message is uh becoming more known in the educational and policy circles and and for good reason.  I mean it’s appropriate.  We want to be as fair as possible for the students and uh you know certainly there are faults you know with any sort of uh uh you know judgment that’s made about what a person knows because you’ll never be able to capture every facet of that knowledge.  But you know you can usually try and pick out pieces and they’ll they’ll uh address you know various uh dimensions you know of a student’s ability and it may not be perfect but it’s it’s probably about as much as we have right now and that’s ok.”

“What would you want teachers to bear in mind in the development of their own classroom tests (pause) regarding validity?”

“(Pause) Um in terms of teachers developing their own tests I would want them to think about what it is that they want to measure and why and how are they going to use that information.  Um there are lots of different types of information that one can collect but they may not always be what you want.  So to have those sorts of questions in mind before you develop an assessment and it doesn’t have to be a formal assessment it can be something very informal.  Teachers are assessing their kids all of the time.  Every time they’re watching the child listening to ‘em speak in the classroom they’re they’re collecting this sort of information in their head and they’re the part of the multiple measures.  They may not be uh written down in any way um but frequently they are and that you just want to try and have a collection of these pieces of information uh documented so that you can basically have a case for (pause) making a judgment about this student.”

“Talk on the importance of stake holders’ involvement in evaluation.”

“Um stake holders are very important in the conduct of evaluation.  Basically stake holders are n..generally defined as people who have a stake in the evaluation of a program um and (pause) I think it’s important to include them for several reasons.  Uh one is that you I think get a more true picture of what a program is like if you include people who know about the program.  Um these could be you know the staff people the clients people that are served by the program the people who are running it the people who designed it.  Uh there are lots of different viewpoints that create (pause) a rr..a program for example.  And that it’s important for an evaluator to be aware of those viewpoints and try and make sure that they’re considered in the evaluation process so that the evaluation itself is something that is fair is honest is ethical and it can be used you know.  That’s another reason why you might want to include stake holders is that if they are involved in the evaluation process they are more likely to be committed to it to believe in it.  Um if they feel that they have the opportunity to provide their input they are more likely to give you correct information (laughs) um and to give you more in depth information that is really what you want to get at in any sort of detailed evaluation study.  Um and I think another reason for including stake holders and why they’re important is that uh it increases the level of use of the evaluation (pause) um and at various levels.  You know people will use a study for various purposes and to the extent that an evaluator can be aware of all of these different audiences and the different goals that they have for an evaluation the different ways that they might want to use the information um the better off the evaluator is to make some sort of judgment about you know the focus that the the what the actual evaluation will focus on.  
Um you know what type of study will yield the most as they say bang for the buck uh because there is you know typically a limited amount of funds a limited amount of time and resources so you want to make sure that whatever data you collect is maximized in some way and that it’s really collected so that it can be useful.  Um and frequently evaluations have in the past not been very useful.  (Laughs)  They they often times will sit on somebody’s shelf and collect dust and they make good door stops too I understand um but this is a new uh trend in evaluation is to include different stake holders people who are invested in the program and the evaluation process.  And I think it’s really better all around.”

“Ok.  What issue or concern more than any other do you see as your personal soapbox issue (Ms. Hofstetter laughs) and then connect your expertise to the education of your students.”

“Um I think my personal soapbox is probably how we use and interpret test results for English language learners.  That uh tests are not necessarily bad things.  They can provide very useful information if they are used appropriately and fairly and the trick for all of us as educators I think is to figure out how to do that.  Um (pause) and it’s pretty difficult with this population because there are a lot of like I said before a lot of you know characteristics of the population that uh make some sort of measurement of their ability very difficult.  But I think over time you know if we work through these issues then hopefully we’ll yield better measurement of what they know and can do.”

“Ok.  Thank you very much.  (Ms. Hofstetter takes deep breath) Is there anything else that you’d like to uh add for teachers of language students?”

“Well I don’t know whether it’s (pause) um…”

“Any advice?  (Ms. Hofstetter laughs) Something about your dissertation or something about the things that you’ve been doing.”

“Advice.  Um well it’s a very exciting area to be working in um because it’s you know a terrific population of students.  I mean one can the reason that you’re probably people are interested in doing research uh or conducting evaluations or programs that serve you know English language learners is because they love the notion of diversity.  Um and there’s just a lot of heterogeneity in this population which is very exciting.  But it’s also very difficult from a research standpoint because it means you have to take into account a lot of other factors that are very hard to take into account.  Um but I think you know if if people are persistent you know we will over time (laughs) probably you know be able to yield some fairly definitive answers to the questions that you’re posing.  Um they’re not easy answers or they’re not easy questions to address um but they have huge policy implications and I think it’s our obligation as researchers and educators to try and you know look at them as as fairly as possible.  So I guess my advice would be you know enjoy the population because they’re great um but if you’re engaging in the research process not to give up.  It’s it’s uh it’s difficult for a reason and I think those who are willing to take into consideration the level of diversity and embrace it um will produce better research but it’s going to be harder research to conduct just for that very reason.  Um so I think that’s basically it.  I’m sure I have many issues that I can (laughs) share with you.  I have many soapboxes but uh that’s probably it for now.  (Laughs)”

“Thanks very much for your time today.”

“Ok.  Well thank you.”