Sociology - Measurement of Data




Measurement of Data
Samuel K. Frimpong (PhD)

Outline of Presentation
- Introduction
- Concepts, Constructs and Measurement
- Levels of Measurement
- Criteria for Assessing Measurement Scales
- Preparing, Inputting and Checking Data
- Exploring Data
- Presenting Data

Introduction
In the social and behavioural sciences, it is not unusual for a researcher to engage participants or respondents in a way that helps him or her ascertain and describe the respondents' feelings, attitudes, opinions, and evaluations in some measurable form. The process of assigning numbers to various attributes of people, objects or concepts is known as measurement, and this is our primary concern in this lecture.

Concept and Measurement
A concept is a mental abstraction or idea formed by the perception of some phenomenon. Examples of concepts in business include job satisfaction, job commitment, brand awareness, brand loyalty, service quality, image, risk, channel conflict, empathy, and so on.
Measurement involves assigning numbers to a phenomenon according to certain rules that reflect the characteristics of the phenomenon being measured.

Measurement Process
The measurement process involves specifying the variables that serve as proxies for the concepts (constructs). A proxy is a variable that represents a single component of a larger concept; taken together, several proxies (indicator variables) are said to measure a concept.

Scale
Scaling is the branch of measurement that involves the construction of an instrument that associates qualitative constructs with quantitative metric units. A scale is a measurement tool that can be discrete or continuous. Discrete scales measure only direction, but continuous scales measure both direction and intensity.
In many ways, scaling remains one of the most arcane and misunderstood aspects of social research measurement.
It also attempts one of the most difficult research tasks: measuring abstract concepts.

Steps in Developing a Scale
1. Define the concept(s) to be measured.
2. Identify the components of the concept.
3. Specify a sample of observable and measurable items to represent the components.
4. Select the appropriate scales to measure the items.
5. Combine the items into a composite scale to measure the concept.
6. Administer the scale to a sample and assess respondent understanding.
7. Revise the scale as needed.

Four Levels of Measurement
- Nominal
- Ordinal
- Interval
- Ratio

Nominal Scale
A nominal scale uses numbers as labels to identify and classify objects, individuals, or events, such as individuals, job titles or positions, brands, and stores.
It is not really a 'scale' because it does not scale objects along any dimension. It simply labels objects.
e.g. Are you happy with the services at Hotel X? Yes / No
A nominal scale is not limited to two categories. We may measure occupation with a nominal scale using the categories:
Teacher [ ]  Doctor [ ]  Lawyer [ ]  Other [ ]

Ordinal Scale
An ordinal scale is a ranking scale. It places objects into predetermined categories that are rank-ordered according to some criterion. It enables the researcher to determine whether an object has more or less of a characteristic than some other object. However, it gives no information about the differences (intervals) between points on the scale.
e.g. Please rank the following attributes of Hotel X from 1 to 4, with 4 being the most important:
Food quality [ ]  Atmosphere [ ]  Prices [ ]  Employees [ ]

Interval Scale
An interval scale uses numbers to rate objects or events so that the distances between the numbers are equal. Thus differences between points on the scale can be interpreted and compared meaningfully. An interval scale has all the qualities of nominal and ordinal scales, plus the differences between the scale points are considered to be equal. For example:
A 10-degree difference on the Fahrenheit scale has the same meaning anywhere along the scale. But we cannot say that 80 degrees is twice as hot as 40 degrees, because the zero point of the scale is arbitrary.

Ratio Scale
The ratio scale provides the highest level of measurement. Consider the question: How many people are there in your household? A response of 1 can only be interpreted in one way, namely that there is only one person in the household. If we compare two responses, say a response of 2 with a response of 4, we can conclude that the numbers in the households are 2 and 4 respectively. Further, we can state that the first household is smaller than the second household by two people. Finally, we may compute the ratio, 4/2 = 2, and conclude that the second household is twice the size of the first.

Types of Scales
Metric Scales
- Summated Ratings
- Numerical
- Semantic Differential
- Graphic Ratings
Nonmetric Scales
- Categorical
- Rank Order
- Sorting
- Constant Sum
- Paired Comparison

Summated Rating Scale
There may be several statements that relate to a single concept, such as opinions about a company or product. The final score for the respondent is the sum of their ratings for all of the items, which is why this is called a "summated" rating scale. When a statement's scale is used individually, it is referred to as a Likert scale.
e.g. Italian Restaurant has a wide variety of menu choices.
1 = Strongly Disagree   2 = Disagree   3 = Neither Agree nor Disagree   4 = Agree   5 = Strongly Agree

Numerical Scale
This scale has numbers rather than verbal descriptions as response options.
e.g. Using a 10-point scale, where 1 is "not at all important" and 10 is "very important", how important is ________ in your decision to do business with a particular vendor?

Semantic Differential Scale
This is another approach to measuring attitudes. It uses 5- or 7-point scales depending on the level of precision desired and the education level of the target population.
The distinguishing feature of this scale is that it uses bipolar end points, with the intermediate points typically numbered. The end points are usually chosen to describe individuals, objects or events with opposite adjectives or adverbs.
e.g. "My supervisor is . . ."
Courteous     ___ ___ ___ ___ ___  Discourteous
Friendly      ___ ___ ___ ___ ___  Unfriendly
Helpful       ___ ___ ___ ___ ___  Unhelpful
Supportive    ___ ___ ___ ___ ___  Hostile
Competent     ___ ___ ___ ___ ___  Incompetent
Honest        ___ ___ ___ ___ ___  Dishonest
Enthusiastic  ___ ___ ___ ___ ___  Unenthusiastic

Graphic Rating
This scale provides measurement on a continuum in the form of a line with anchors that are numbered and named. The respondent gives an opinion by placing a mark on the line. Sometimes the midpoint is labeled and other times it is not.
e.g. On a scale from 0 to 10, how would you rate the atmosphere of Samuel's Greek Cuisine restaurant? Indicate by placing an "X" at the appropriate place on the line.
Poor           OK       Excellent
|_______________|_______________|
0               5              10

Categorical
This is a nominally measured opinion scale that has two or more response categories. When there are more categories, the researcher can be more precise in measuring a particular concept. It is often used to measure categories such as age, gender or education.
e.g. How satisfied are you with your current job?
[ ] Very Satisfied
[ ] Somewhat Satisfied
[ ] Neither Satisfied nor Dissatisfied
[ ] Somewhat Dissatisfied
[ ] Very Dissatisfied
e.g. How interested are you in learning more about the benefits that are offered with this health plan?
[ ] Very Interested
[ ] Somewhat Interested
[ ] Not Very Interested

Rank Order
For example:
"Please rank the five attributes listed below on a scale from '1' (the most important) to '5' (the least important) in searching for a job."
Job Attributes                       Ranking
Pay                                  ___
Benefits                             ___
Co-workers                           ___
Flexible Scheduling of Work Hours    ___
Working Conditions                   ___

Sorting
This scaling approach asks respondents to indicate their beliefs or opinions by arranging objects (items) on the basis of perceived similarity, preference, or some other attribute.

Constant Sum
"Please allocate 100 points across the following four attributes to indicate their relative importance."
Attributes             Score
On-Time Delivery       ___
Price                  ___
Tracking Capability    ___
Invoice Accuracy       ___
Sum                    100

Paired Comparison
Below you will find ten pairs of attributes that have been identified as being important when choosing a restaurant. For each pair, mark the attribute you feel is more important to you in choosing a restaurant to dine at.
Pairs      Attribute 1     Attribute 2
Pair 1     Food Quality    Atmosphere
Pair 2     Food Quality    Prices
Pair 3     Food Quality    Service
Pair 4     Food Quality    Cleanliness
Pair 5     Atmosphere      Prices
Pair 6     Atmosphere      Service
Pair 7     Atmosphere      Cleanliness
Pair 8     Prices          Service
Pair 9     Prices          Cleanliness
Pair 10    Service         Cleanliness

Criteria for Assessing Measurement Scales
The criteria are:
- Reliability
- Validity
The accuracy of a measurement scale is associated with validity, whilst consistency is associated with reliability. A survey instrument is considered reliable if its repeated application results in consistent scores. This is contingent upon the definition of the concept(s) being unchanged from application to application. To be reliable as a scale, the questions must be answered by respondents consistently, in a manner that is highly correlated.

Types of Reliability Tests
There are three types:
- Test-retest reliability
- Alternative forms reliability
- Internal consistency reliability

Test-Retest Reliability
Test-retest reliability is obtained through repeated measurement of the same respondent or group of respondents using the same measurement device and under similar conditions.
Results are compared to determine how similar they are. If they are similar, typically as measured by a correlation coefficient, we say the measure has high test-retest reliability.

Problems
Taking the test the first time may influence respondents' answers the second time they take it. Situational factors, such as how one feels on a particular day, may also influence how respondents answer the questions, and something may change in the time between repeated uses of the test.

Alternative Forms Reliability
Alternative forms reliability can be used to reduce some of the above problems. To assess this type of reliability, the researcher develops two equivalent forms of the construct. The same respondents are measured at two different times using the equivalent alternative forms. The measure of reliability is the correlation between the responses to the two versions of the construct.

Internal Consistency Reliability
This type of reliability is used to assess a summated scale, where several statements are summed to form a total score for a construct. There are two types of internal consistency tests:
- Split-half reliability: the researcher divides the scale items in half and correlates the two sets of items. A high correlation between the two halves indicates high reliability.
- Coefficient alpha, also referred to as Cronbach's alpha: conceptually, it is the average of the coefficients from all possible split halves. Coefficient alpha typically ranges from 0 to 1, with higher values indicating greater internal consistency. In practice, statistical packages are used to compute it.

Validity
Validity is the extent to which a construct measures what it is supposed to measure. A construct with perfect validity contains no measurement error. An easy measure of validity would be to compare the observed measurement with the true measurement. The problem is that we very seldom know the true measure.
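
Before turning to the individual validity checks, the internal consistency measures described above can be made concrete with a short Python sketch. It scores a summated scale, computes a split-half correlation, and computes coefficient alpha. The response data, the function names (pearson, summated_scores, split_half_correlation, cronbach_alpha) and the odd/even item split are assumptions made for illustration and do not come from the lecture. Alpha is computed here with the standard variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), rather than by averaging split halves.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def summated_scores(responses):
    """Total score per respondent: the sum of that respondent's item ratings."""
    return [sum(row) for row in responses]


def split_half_correlation(responses):
    """Correlate the summed odd-position items with the summed even-position items."""
    first_half = [sum(row[0::2]) for row in responses]
    second_half = [sum(row[1::2]) for row in responses]
    return pearson(first_half, second_half)


def cronbach_alpha(responses):
    """Coefficient alpha from item variances and the variance of the total scores."""
    k = len(responses[0])               # number of items in the scale
    items = list(zip(*responses))       # one tuple of ratings per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance(summated_scores(responses))
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 5 respondents rating 4 statements on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]

totals = summated_scores(responses)
split_half_r = split_half_correlation(responses)
alpha = cronbach_alpha(responses)

print("Summated scores:", totals)
print("Split-half correlation:", round(split_half_r, 3))
print("Coefficient alpha:", round(alpha, 3))
```

A split-half result depends on how the items are divided, which is one reason coefficient alpha is usually reported instead.
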
To assess measurement validity we use one or more of the following approaches:
- Content validity
- Construct validity
- Criterion validity
- Face validity

Content Validity
Content validity is based on the extent to which a measurement reflects the specific intended domain of content.
It can be illustrated with the following example: researchers aim to study mathematical learning and create a survey to test for mathematical skills. If these researchers tested only for multiplication and then drew conclusions from that survey, their study would not show content validity, because it excludes other mathematical operations.

Construct Validity
Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure. The theory is used to explain why the scale works and how the results of its application can be interpreted. To assess construct validity, two checks are performed. Convergent validity is the extent to which the construct is positively correlated with other measures of the same construct. Discriminant validity is the extent to which the construct does not correlate with measures of constructs that are different from it.

Criterion Validity
Criterion validity assesses whether a construct performs as expected relative to other variables identified as meaningful criteria. For example, theory suggests that employees who are highly committed to a company would exhibit high job satisfaction. Thus, correlations between measures of employee commitment and job satisfaction should be positive and significant. If this is so, then we have established criterion validity for our construct.
To establish criterion validity, we need to show that the scores obtained from the scale being validated are able to predict scores on a theoretically identified dependent variable, referred to as the criterion variable. One or both of two types of criterion validity checks can be performed.
These checks are referred to as concurrent and predictive validity.

Face Validity
This is concerned with how a measure or procedure appears. Does it seem like a reasonable way to gain the information the researchers are attempting to obtain? Does it seem well designed? Does it seem as though it will work reliably? Unlike content validity, face validity does not depend on established theories for support.
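
The convergent and discriminant checks under construct validity both come down to correlations, so they can be sketched in a few lines of Python. Everything in the example is hypothetical: the scores, the variable names, and the unrelated measure are invented for illustration. The sketch only computes the two correlations; how large or small they must be to support validity is a judgment the lecture does not specify.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical scores for six respondents.
new_commitment_scale = [12, 18, 9, 15, 20, 11]      # scale being validated
established_commitment = [14, 19, 10, 15, 21, 12]   # another measure of the same construct
unrelated_measure = [30, 25, 28, 31, 29, 24]        # measure of a different construct

# Convergent validity: expect a strong positive correlation.
convergent_r = pearson(new_commitment_scale, established_commitment)

# Discriminant validity: expect a correlation close to zero.
discriminant_r = pearson(new_commitment_scale, unrelated_measure)

print("Convergent correlation:", round(convergent_r, 3))
print("Discriminant correlation:", round(discriminant_r, 3))
```

The same pearson helper could be applied to two administrations of one scale to estimate test-retest reliability, or to a scale and its criterion variable for criterion validity.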

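Two of the nonmetric formats described earlier, paired comparison and constant sum, have simple mechanical rules that a short Python sketch can illustrate: five attributes yield ten distinct pairs, and a constant-sum response is usable only if the points add up to the stated total. The function name is_valid_constant_sum and the example allocation are assumptions made for illustration.

```python
from itertools import combinations

# The five restaurant attributes from the paired comparison example.
attributes = ["Food Quality", "Atmosphere", "Prices", "Service", "Cleanliness"]

# Every unordered pair of attributes: n * (n - 1) / 2 pairs for n attributes.
pairs = list(combinations(attributes, 2))

for number, (first, second) in enumerate(pairs, start=1):
    print(f"Pair {number}: {first} vs {second}")


def is_valid_constant_sum(allocation, total=100):
    """Check that all points are non-negative and add up to the required total."""
    points = list(allocation.values())
    return all(p >= 0 for p in points) and sum(points) == total


# Hypothetical response to the 100-point allocation question.
example_allocation = {
    "On-Time Delivery": 40,
    "Price": 30,
    "Tracking Capability": 10,
    "Invoice Accuracy": 20,
}

print("Valid allocation:", is_valid_constant_sum(example_allocation))
```

Because the number of pairs grows quickly with the number of attributes, paired comparison is usually practical only for short attribute lists.
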