My Research Journal

Wednesday, 7 November 2007
Data analysis
Mood:  chillin'
Now Playing: Incessant noise of the Jennie Lee Building
Topic: Data Analysis

Well, I've done some preliminary data analysis with the 22 participants I have. There seem to be two outliers (J & E) who are unduly influencing the data, so I may have to remove them in the final analysis, but we shall see what happens.
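
As a rough sketch of the kind of outlier screen I mean (the numbers below are invented, not the actual participants' scores; the last two entries play the role of J & E):

```python
# Invented scores for 22 participants; the last two mimic extreme cases
# like J & E. Flag anything more than 2 SDs from the mean.
import numpy as np

scores = np.array([14, 15, 13, 16, 15, 14, 17, 15, 16, 14,
                   13, 15, 16, 14, 15, 17, 16, 15, 14, 16,
                   2, 29])

z = (scores - scores.mean()) / scores.std(ddof=1)
outliers = np.where(np.abs(z) > 2)[0]   # indices of flagged participants
```

With these numbers only the last two participants get flagged; whether to actually drop them in the final analysis is a separate judgement call.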

With the Latin square design, thankfully it seems that order does not make a difference to how the problems are answered. However, problem 2 is being done significantly better by students than problems 1 and 3. Unfortunately, there is no clear indication that any of the boxes are actually helping the students, although I am still hoping something will show up in the next 14 data collections. So far it seems that the students are doing better on the constructive problems using the black-box than the open and glass box, but this is only in the marks they're receiving; I still have to see what is happening in terms of self-explanations, and hopefully there is something there which might help. Further, on the high conceptual material the glass box seems to be doing the worst, but in general everyone is doing worse with this material, so it might be, as Doug says, a glass-bottom effect. I think the reason students are getting the high conceptual parts of the LP wrong is that they're drawing on real-life experience, which is interesting, as this usually helps people get the answer right in most problems; as James says, it is because they're not seeing or formulating it as a linear programming problem in their minds.
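
A minimal sketch of the order-effect check, with made-up marks; this flattens the Latin square into a simple one-way ANOVA on presentation order, so it is only a stand-in for the real analysis:

```python
# Invented marks grouped by the slot in which a problem was presented
# (first, second, third). A non-significant p-value is consistent with
# order not mattering, as in the entry above.
from scipy import stats

first  = [12, 14, 13, 15, 14, 13, 15]
second = [13, 14, 15, 13, 14, 15, 14]
third  = [14, 13, 12, 15, 14, 13, 14]

f_stat, p_value = stats.f_oneway(first, second, third)
```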

However, I think doing the high conceptual parts depends in part on some of the low conceptual parts in the interpretive questions, and on the procedural parts in the constructive questions. I should probably look at whether, when they get the first part correct, they go on to do the high conceptual part right as well ... which I think might be more interesting.

Hopefully, when I come to the self-explanations I can see something happening which might explain the differences between the boxes (hopefully!)

Posted by prejudice at 11:48 AM GMT
Friday, 16 February 2007
Maths-Computing analysis
Mood:  don't ask
Now Playing: The Campfire Has Gone Out (Don Edwards)
Topic: Data Analysis

Well ... I'm still getting pretty frustrated looking for tasks that can measure conceptual knowledge, as most of them seem to love linking multiple representations, particularly graphs!! Never mind that, and this never-ending literature that seems to love high-school and primary-school examples; if I see one more example on division and fractions, I swear I'm going to scream.

Anyway, since multiple representations (as in CAS) are linked to conceptual knowledge, and this is sometimes called meaningful learning, which in turn is linked to the deep approach, I got to thinking: why not check whether the deep approach scores differed for the three types of software? (I did do this analysis before but had no idea what it really meant at the time.) Sure enough, the CAS came up higher on the Deep Approach score and lower on the Surface Approach score. I'm guessing it has something to do with multiple representations, or perhaps there's no relationship at all!
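
Since the approach scores turned out non-normal later on, a nonparametric comparison of the Deep Approach scores across the three software types might look like this (scores below are invented; the CAS group is deliberately given higher values to mirror the pattern described):

```python
# Invented Deep Approach scores for students using each software type.
# Kruskal-Wallis compares the groups without assuming normality.
from scipy import stats

cas         = [32, 35, 34, 36, 33, 35]
spreadsheet = [28, 30, 29, 27, 31, 29]
graphing    = [29, 28, 30, 29, 27, 30]

h_stat, p_value = stats.kruskal(cas, spreadsheet, graphing)
```

With these made-up numbers the CAS group's advantage comes out significant; the real data would of course decide.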

Posted by prejudice at 11:50 AM GMT
Monday, 5 February 2007
Some random ideas
Mood:  rushed
Now Playing: Jeene Ke Ishaare (Phir Milenge)
Topic: Data Analysis

Well, I've got to write up my remote observation study, but it doesn't look possible to get it done soon. Anyway, I was thinking that as I go through and code, I can code each student as theoretical, rational, scholar, experimenter or thinker, following Trouche (2000), and see if it differs depending on which software they're using. Based on my French translation (I think I may have missed some important bits), this is what the five profiles mean:

Theoretical: uses references (notes, paper); works towards interpretation for understanding; uses analogies for proof; time spent on the calculator as a whole is medium, and time spent calculating a particular aspect is high.

Rational: uses pen and paper; uses inferences for understanding; uses demonstrations for proofs; spends limited time overall on the calculator, and also limited time on each action performed on it.

Thinker: uses the calculator for its information; investigates in trying to understand; accumulates (?) knowledge for understanding; time spent on the calculator as a whole is high, but time spent on each action is limited.

Experimenter: uses all devices for information (notes, pen/paper and calculator); compares for understanding; confronts when attempting proofs; spends about a medium amount of time overall on the calculator, and a high amount of time on each action.

Scholar: doesn't use any of the above devices for information; investigates for understanding; gets stuck or copies when doing a proof; uses the calculator a medium amount of time overall, with a high amount of time on each action.

Posted by prejudice at 10:38 AM GMT
Monday, 22 January 2007
Back at the OU/ PME paper/ pilot study analysis
Mood:  a-ok
Now Playing: La Bamba (Los Lobos)
Topic: Data Analysis

I'm back at the OU trying to get into the swing of things. I still have no clue about my main research study ... and I need to firm that up quickly, because I think I need to get it started by March.

Whilst I was on holiday I did some analysis of the videos, since I was writing a PME paper; not sure how well that went, but I got it in. I listened to all the audio for the students while they were doing the practice question, since I thought perhaps that is where the initial interaction with the software, and the learning, occurs. First of all, I found that the two participants (J and Cl) who had some mathematics background, and who had both started with the grey-box, both looked to see how the expected values were being calculated. The rest of them didn't look at the grey-box for very long.

I think the people who didn't know any of the maths only started trying to understand how to calculate the expected value when they were doing the white-box (e.g. R, Ch, G, Cl). I'm not too sure there was much conceptual understanding; it seemed more procedural. I think this might be reflected in most of them not getting the conceptual problems right, except Ch. I'm not sure if that is down to his configuration (BB, GB then WB) or something intrinsic to him, although R also asked about understanding expected value and had a similar configuration (BB, WB then GB). I think my data is too little to draw any good conclusions.

Posted by prejudice at 11:47 AM GMT
Wednesday, 30 August 2006
Pilot Study yet again!
Mood:  a-ok
Now Playing: Teahouse on the Tracks (Donald Fagen) - where do they get these titles from??
Topic: Data Analysis

So, back to my pilot study. I still have no clue what my data is saying; all I could figure out is that I've got too many factors, and I'm getting more confused by the moment. At first I thought to do a covariate analysis, but that has got me even more confused, so I think I'm going to go simple and build from that.

 Wish me luck :D.

Posted by prejudice at 11:05 AM BST
Friday, 25 August 2006
Feeling a bit on the slump
Mood:  lazy
Now Playing: Another Pot O' Tea (Anne Murray)
Topic: Data Analysis

I don't think I've made any real progress with my work thus far. When it comes to my data analysis, John has suggested I ignore the non-normality (well, not ignore exactly!) and find literature suggesting that ANOVAs are robust against violations of the normality and homogeneity-of-variance assumptions (he suggested I read Box's paper). My problem: I'm not sure if you can actually mix both. So what I'm going to do is treat neither violation as large and continue with ANOVAs, but I might use Brown & Forsythe's F instead for the heterogeneity, since there are unequal sample sizes. Oh well, still deciding.
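
For the heterogeneity check itself, a sketch with invented, unequal-sized groups. Note that SciPy's `levene` with `center='median'` is the Brown-Forsythe test for equality of variances, which is related to, but not the same as, Brown & Forsythe's F* test for comparing means:

```python
# Invented scores for three unequal-sized groups, the middle one with a
# much larger spread. levene(center='median') is the Brown-Forsythe
# variance-homogeneity test: it checks the assumption, it is not the
# BF F* test for means.
from scipy import stats

group_a = [10, 12, 11, 13, 12, 11, 12, 13]
group_b = [5, 20, 8, 17, 2, 23]
group_c = [11, 12, 10, 13, 12]

w_stat, p_value = stats.levene(group_a, group_b, group_c, center='median')
```

A small p-value here signals heterogeneous variances, which is what pushes one toward the robust alternatives discussed above.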

Anyway, I've got to decide quickly, since I've got to send in a report on this by next Friday!


Posted by prejudice at 9:39 AM BST
Friday, 18 August 2006
Pilot Study: Data Analysis
Mood:  irritated
Now Playing: Melody of You (Sixpence None the Richer)
Topic: Data Analysis

So, I have the data for my pilot study using the SPQ and the maths-computing inventory. I've begun to try and analyse it, but I seem to hit a brick wall every time I try!

Well, the first brick wall came when I realised that I had grouped Environmental Sciences and Biology together in the questionnaire. This meant I had a Hard-Applied-Life and a Hard-Pure-Life discipline in the same grouping, and I was stuck on how to separate out the Env. Sci. and Biology students. Thankfully I had each student's intended degree from when they entered the OU, and was able to pick out, more or less, those who were from Env. Sci.

The next problem (well, more my fault!): I had started the analysis when I realised that the maths-computing inventory had negatively marked questions, which I had forgotten to account for when calculating the scale scores. So I had to go back, redo those, and restart the analysis.
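
The reverse-scoring fix is simple enough to sketch; the item names and which items are negatively keyed are made up here, assuming a 1-5 Likert scale:

```python
# Hypothetical raw answers on a 1-5 Likert scale; q2 and q4 are assumed
# to be the negatively keyed items. A reversed item scores
# (max + min) - raw, i.e. 6 - raw here.
responses = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}
reverse_keyed = {"q2", "q4"}

scored = {item: (6 - raw if item in reverse_keyed else raw)
          for item, raw in responses.items()}
scale_total = sum(scored.values())
```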

Well, after doing several ANOVAs etc. (again my fault!), I then decided to check whether my data was normal and had homogeneous variances between the groups (the disciplines, gender and software), and whaddya know ... they weren't!!! All the scores I had calculated from both the SPQ and the Maths-Computing Inventory were non-normal (Kolmogorov-Smirnov normality test). That got me frustrated, so I tried to see if I could transform the data to normality (log, square, square root, arcsine, reciprocal, and various combinations of those), but no go; as the kurtosis and skewness were both less than 1, I guess that is why the transformations brought no improvement. I then proceeded to trim the samples by taking out the outliers (well, I only did it for the surface approach score; I haven't really tried the others), and that didn't improve things one bit: same sort of results. So I gave up on that and accepted that my data was non-normal.
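
The normality screen and transformation loop might be sketched like this (data invented and deliberately skewed; note SPSS's K-S test applies the Lilliefors correction, so a plain K-S test against a fitted normal, as below, is only a rough stand-in):

```python
# Deliberately skewed invented data, screened for normality before and
# after some of the usual transformations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
scores = rng.exponential(scale=2.0, size=500)   # strongly non-normal

transforms = {
    "raw":   scores,
    "log":   np.log(scores + 1),
    "sqrt":  np.sqrt(scores),
    "recip": 1.0 / (scores + 1),
}

results = {}
for name, x in transforms.items():
    z = (x - x.mean()) / x.std(ddof=1)   # standardize first
    stat, p = stats.kstest(z, "norm")
    results[name] = p                     # small p = non-normal
```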

Then came the other brick wall: the test of homogeneity of variance (Levene's test). Some groups were homogeneous whilst others weren't; only gender was homogeneous across all scores. The rest were heterogeneous for at least one variable; for example, the Hard/Soft discipline grouping was heterogeneous for the mathematics motivation scale. I know I can apply the Kruskal-Wallis test to the non-normal data, but it needs homogeneous variances, so I have done that for gender, where just about everything came out non-significant except computer confidence. What was interesting was that I wanted to check whether that was influenced by age (age was related to computer confidence as well, using Spearman's rank correlation coefficient). In normal parametric statistics you can control for such a variable (perhaps as a covariate), but in nonparametric statistics I wasn't certain how, so I decided to use Spearman's rank correlation to test the correlation between age and computer confidence while controlling for gender (i.e. just for males and then just for females), to see what would happen. And there does seem to be some correlation between age and computer confidence when gender is controlled: for females there is a correlation, but for males there is none.
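
The controlling-by-splitting idea can be sketched as follows. The numbers are invented, with a clear monotone trend built into the female subgroup and none in the male one, to mirror the result described:

```python
# Invented data: age vs computer confidence, correlated separately within
# each gender as a crude nonparametric way of controlling for gender.
from scipy import stats

female_age        = [21, 25, 30, 34, 41, 47, 52, 58]
female_confidence = [9, 8, 8, 7, 6, 5, 5, 3]    # built-in decline

male_age          = [22, 27, 33, 38, 44, 50, 55, 60]
male_confidence   = [6, 8, 5, 7, 6, 8, 5, 7]    # no trend

rho_f, p_f = stats.spearmanr(female_age, female_confidence)
rho_m, p_m = stats.spearmanr(male_age, male_confidence)
```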

Anyway, I've moved away from the point. I now have data that is non-normal with heterogeneous variances, and I don't know what tests I can use in SPSS for comparing across groups. I have looked it up on the internet, and it seems there is a new statistic called the MOM-H statistic, which uses Wilcox and Keselman's MOM (Modified One-Step M-estimator) to decide how much should be trimmed, combined with Schrader and Hettmansperger's H statistic (all of this developed by Othman et al., 2004); it can be used with non-normal data and heterogeneous variances. I still have to look up the paper, but my problem is that I'm wondering whether I even want to venture into that realm of deep statistics. I mean, I'm not sure how important these pilot study results will be to my main study; should I go through all that data analysis for nothing? Can my simple descriptive statistics suffice? I've got to decide, because it will take me a while to decipher what all the symbols in those papers mean and how they relate to my data before I can even begin to use it, and that might take up more time than I really have.
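
The full MOM-H machinery is beyond a quick sketch, but the trimming idea it builds on is simple: compare the ordinary mean with a 20%-trimmed mean, which discards the extreme tails before averaging (data invented, with one wild value):

```python
# Invented scores with one wild value. trim_mean drops the top and
# bottom 20% of observations before averaging.
import numpy as np
from scipy import stats

scores = np.array([12, 13, 13, 14, 14, 15, 15, 16, 16, 45])

plain_mean   = scores.mean()
trimmed_mean = stats.trim_mean(scores, proportiontocut=0.2)
```

Here the single wild value drags the plain mean up to 17.3, while the trimmed mean stays at 14.5.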

Posted by prejudice at 12:03 PM BST
Thursday, 30 June 2005
Some things to consider when analysing and writing up
Mood:  a-ok
Now Playing: Can't Think Straight - Gilbert O'Sullivan
Topic: Data Analysis
Well, these are little jottings I made whilst doing other things, so I want to make sure I keep them in consideration:

(i) I have to look up some plausible reasons for using a 5-point Likert scale versus a 7-point scale like Albritton et al. Hmm ... I can't find any good papers saying why I shouldn't, except one that says I should sort of use the 7-point, but they were actually trying to support their own use of the 7-point scale (some people called Wyrwich and Tandino). However, I could point out that since the analysis was intended for chi-square and there were so few intended respondents, there was a higher likelihood of empty cells, hence making it a 5-point scale. Also, we allowed 'fence sitting' to ensure a response, despite the tendency to use the middle point more on a 5-point scale.

(ii) We have a low response rate for the email questionnaire. Doug was saying this is always the case in comparison with a paper questionnaire, but I've got to check that. The problem with comparing against known data is that most people knew the list they were sending to, so the questionnaire had direct relevance to them, whereas there was some uncertainty about relevance with my list. Anyway, I found some literature saying the average response rate for email questionnaires is in the 20s to 30s percent. So, we'll see.

(iii) It might also be useful to compare the level of the courses with the ATI responses I received, to check if there is any difference in the way the courses are delivered.

(iv) I also have to consider what I would have done differently if I had to do it all over again (simple: send it during the course term!!!)

(v) I'll also have to compare the distribution of the responses to the list I have, to see whether the absent responses were random, and how the distribution falls across disciplines.

(vi) People might have seen the material on the simplex algorithm, graphical solution etc. and thought that is what I meant by coverage, and so said they didn't cover the solution; I'm not sure if that will influence the outcome.
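
On point (i), the empty-cell worry can be checked directly, since a chi-square routine returns the expected counts and the usual rule of thumb wants them to be at least 5 per cell (the table below is invented: two groups by five collapsed response categories):

```python
# Invented counts: two groups by five collapsed response categories.
# chi2_contingency returns the expected counts, so cells below the
# rule-of-thumb threshold of 5 can be counted directly.
import numpy as np
from scipy import stats

observed = np.array([[8, 12, 15, 10, 5],
                     [6, 14, 13, 11, 6]])

chi2, p, dof, expected = stats.chi2_contingency(observed)
small_cells = int((expected < 5).sum())   # cells violating the rule of thumb
```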

Posted by prejudice at 1:55 PM BST
Tuesday, 21 June 2005
Meeting with John about data analysis
Mood:  a-ok
Now Playing: Surfin' USA (The Beach Boys)
Topic: Data Analysis
So, I had my meeting with John this afternoon, and James came along as well. I discussed with John my idea of converting the disciplines into soft vs hard, pure vs applied, and life vs non-life. He indicated that they were doing something similar in the SOMUL project and that the work they were using to classify the disciplines was Tony Becher's (he only mentioned this because I was using Biglan's categorisation); I think the two may be similar, or Becher may be using Biglan's work for his classification. I remember the name Tony Becher because I have a paper by Neumann et al., which Becher co-authored, concerning disciplines.

Anyway, James has some reservations about using this classification, since he doesn't think lecturers think of themselves as being applied or pure at all; that label only arises because the department chooses to use the word 'applied' in its title. Well, to some extent I think that is true, but it doesn't prevent the shaping of a course to reflect its applied nature, and teachers conforming to that situation. I think there is literature (can't remember where) indicating that lecturers do change their approach to lecturing a course depending on its department or discipline (or something like that); I think that was in the Lindblom-Ylanne et al. paper.

Well, I think John is pretty much sold on the idea of the disciplines, so he decided to give me some ammunition to counter James's opposition by recommending that I speak to Yann Lebeau, who worked with him on the SOMUL project, to help provide me with literature concerning how subjects are placed into each kind of discipline (i.e. soft or hard etc.).

Well ... with respect to the logistic regression, John wasn't too certain what the best road to take is, since he hasn't done logistic regressions in quite a while. He suggested that for those questions with a small number of 'not sure' answers (such as the coverage and delivery questions) I treat them as missing data; I can then treat the other four options as ordinal and hence use ordinal logistic regression. He cautioned, however, that I must check whether the assumptions of logistic regression are upheld, and that if they aren't and I go ahead anyway, I should remember to state in my discussion that I am treating the data as if the assumptions hold when it is known they don't, so the results may not be quite correct.

Further, he suggested that I look up some non-parametric tests to see whether I can use them on my data; he mentioned some kind of non-parametric one-way ANOVA whose name I couldn't remember. I've found it on the net: it's called the Kruskal-Wallis one-way ANOVA. However, the problem is that I believe it can only deal with one factor ... wait, let me check my facts ... no, that is not true, since you can have a two-way Kruskal-Wallis, at least according to that website. However, I think it becomes more complicated than that. John did suggest using multivariate analysis, but that assumes the scales are continuous. John said that for the ATI they used (Intentions and Beliefs), they treated the items as reflecting an underlying continuous scale and hence took the values as continuous. I'm not at the point of believing mine are continuous, so I will stick with treating them as ordinal, in which case ordinal logistic regression (at least for the LP part) should be best. I have a feeling that logistic regression is a non-parametric test, since it uses chi-square values, if I remember correctly. Well, it does use chi-square, so it is indeed non-parametric in nature, and there is no need to uphold the assumptions of normality etc. But one website indicates that I should have at least 50 cases for each independent variable (sheesh, I don't have that), so I'm not sure what I'm going to do. There is a suggestion to use discriminant analysis instead, but then I think I would lose the ordinality, plus it needs normally distributed data with equal variances etc. Anyway, I'll have to examine normality, and if it holds I might go with discriminant analysis, because according to the website it is more powerful; but I'm more comfortable with logistic regression ... we'll see what I use.

Anyway, John also suggested that where persons left out two or three ATI questions, we can assume 'not sure' (but to do this, you must run a missing-value analysis and then decide where the cut-off point is!). He also suggested that I combine the values for Intentions and Beliefs and do a two-part multivariate analysis (or was it called a doubly multivariate analysis?), since he said they were essentially the same thing (though I've got to base this on literature; he was basing it on Norton et al. not finding any difference between them).

Further, in particular with respect to the questions on the delivery of the simplex algorithm etc., John suggested (since I have the 'not sure' and 'not taught' options, which may well be chosen a lot and cannot be treated as missing values) combining the values to obtain an ICT variable, i.e. the amount that ICT is used in different disciplines.

Posted by prejudice at 3:03 PM BST
Updated: Tuesday, 21 June 2005 3:46 PM BST
Monday, 20 June 2005
Not sure about my data analysis!
Mood:  quizzical
Now Playing: Time Marches on (Tracey Lawrence)
Topic: Data Analysis
I'm looking at the analysis of my data for the LP section. What I did: I got Biglan's (1973) papers (both of them) and decided to categorise my disciplines according to his methodology, that is, into hard vs soft, pure vs applied, and life systems vs non-life systems. I had some problems deciding which category some disciplines fitted into. For example, I felt computer science should be hard and non-life, but is computer science considered a pure or an applied subject? I wasn't certain about that. Anyway, he had some categories of disciplines in a table, so I followed that to some extent. What I also did, when I was unsure, was go to the course websites to see what the course was about and what department it was in, to fit it in better; for example, things that involved built and natural environments (just to tell you: I decided those were life systems).

Well, after I did that, I decided to look at how the responses were distributed for the coverage and delivery of formulation, solution and sensitivity analysis. I decided to do ANOVAs, but something just wasn't right, because this is ordinal and nominal data rather than continuous. So I looked a bit at how to analyse that; at first I tried some loglinear analysis but wasn't sure what results I got. I finally decided on logistic regression, which seems to be the way to go. I'm doing ordinal logistic regression using the PLUM module in SPSS, and it seems to be working out to some extent ... except I've discovered one problem with my logic: my ordinal variables (i.e. my responses to delivery and formulation) may not be truly ordinal, since I have the 'not sure' option coded as '5'. So I was wondering what to do with that!

Well ... I thought maybe I should make it a missing value (I've given the value 999 to my missing values), and I am seriously considering doing that. I meet John tomorrow, so we'll see what he says.
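
The recoding step being considered might be sketched like this, with invented responses, treating both '5 = not sure' and the 999 missing code as missing so that a genuinely ordered 1-4 scale remains:

```python
# Invented coverage responses: 1-4 are ordered answers, 5 means 'not sure'
# and 999 is the existing missing-value code. Both are recoded to missing,
# leaving a genuinely ordinal 1-4 variable for the PLUM-style model.
import numpy as np
import pandas as pd

coverage = pd.Series([1, 3, 5, 2, 4, 5, 1, 999, 3])

ordinal = coverage.replace({5: np.nan, 999: np.nan})
n_usable = int(ordinal.notna().sum())
```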

Posted by prejudice at 5:18 PM BST
