Saturday, December 13, 2008

Final exam grading notes

I've just finished a pass through all the final exams. Overall, I'm pleased.

I expected the first question to be pretty easy. You can't put everyone on the same floor (within the 30 meter "asymptote" mentioned by Herbsleb), but you can try to arrange offices to make the most critical interactions as frequent as possible, and you can do some other things to create opportunities for interaction. The cafeteria on the first floor was a big hint, and many of you took it. One of the interesting ideas (which I didn't think of, but more than one of you did) was having staggered lunch times to try to create specific groupings of people (either within a team, or across teams) to talk. Another idea was having big meetings in the main conference room just before lunch. A couple of you also had some good specific ideas about how to encourage people to move from floor to floor, e.g., by having different snack foods or other facilities available on each floor. There were also good ideas about using the smaller conference rooms on each floor in ways that encourage informal meetings, such as giving each team a space in a conference room in which they could keep up material on walls and whiteboards. Some of you also mentioned that, since the time in which the evidence Herbsleb relies on was gathered, "virtual" interactions like IM have become better and more widely used. I liked these ideas that were specific to the circumstances of this company (number of teams and employees, facilities on each floor) and creative in the ways you addressed the problems we have discussed.

The second question drew directly from some in-class discussions, and a lot of you did recall and adapt that, with discussions of root-cause analysis and adjusting the process to avoid certain classes of errors as well as detecting them more effectively. Some of the answers discussed looking at the history to assess maintenance of old code, but didn't really address how to use the information for new systems. A lot did address using it for new systems, but were pretty light on discussing how ... which shouldn't surprise me, because we didn't talk much about how error reports are triaged and classified to make that possible. Several answers underestimated the difficulty of searching for relevant bug reports (similar symptoms don't necessarily indicate similar causes), but I didn't hold that against you.

I expected the third question to be the hardest, and I think it was. Some of the answers were pretty generic and didn't address the peculiar challenges and opportunities in this specific problem (final hardware available only late in development, and a brand new user interface which will depend on the functionality of that hardware). I dropped a big hint in saying that the functionality was already available, but in a larger form factor, and a few (but not many) of you took advantage of that in the way I was hinting at: We can work with fully functioning prototypes that just don't fit in a watch. Maybe we can hang part of the hardware on a user's belt, maybe we need someone running along beside the user carrying some of the hardware, but we can do some pretty good usability testing using the hardware that is available early. While this specific approach was not mentioned much, there were other pretty reasonable tactics, like building the software with a bare minimum of functionality and based on a bare minimum (and hopefully stable) specification of what the hardware could deliver.

An interesting note on that last question is that many answers either suggested or assumed that it had a much larger than usual development team (often the whole software division), and one made a (pretty good) argument that it should be done by a smaller than usual team.

Friday, December 5, 2008

Final exam: 10:15 Tuesday

Our final exam is scheduled from 10:15 Tuesday until noon. It will be in our regular classroom.

You can expect something roughly like the midterm in terms of approach. Instead of one question, expect 3-5 somewhat shorter questions (with shorter answers). Material that may be covered includes the two papers we read, plus anything we discussed in class.

You can bring any kind of reference material you want to the exam --- books, notes, whatever, in printed or electronic form. The only thing off limits is web searches.

You may take the exam entirely on paper (bring some paper to write on, and something to write with), or you may type your answer on a laptop computer.

If you use a computer, I prefer a plain text file. (If it is significantly faster for you to use a word processor, go ahead, but produce a PDF for me as well as the original word processor file.) The name of the file you produce should be your family name followed by the class number. For example, if I were taking the exam, I would turn in a file called "YoungCIS422.txt".

I will try to remember to bring one or more USB keys to gather electronic copies of exams. In case I forget, it wouldn't be a bad idea if you had one handy too.

Monday, December 1, 2008

Rehearse/discuss Monday, present Wednesday, in Deschutes

Reminder: We decided to use today (Monday) to prepare presentations, and we will fit all of our presentations into Wednesday ... it will be a little tight, but nice to have them all together. We will meet both days in the colloquium room on the second floor of Deschutes hall.

Tuesday, November 25, 2008

OSB data in simple text format

I merged and tweaked a couple of example programs from the open source "shapelib" project for reading ESRI shape files, and Anthony ran the program over the OSB data to produce the text file in
http://www.cs.uoregon.edu/classes/08F/cis422/data/OSBShpExtract.txt

This is not in XML, but it's a very simple format that can be parsed with a scripting language like Python, Perl, or Awk, or in Java with the StringTokenizer class or the String.split method. Each line can be identified by the first token on the line, and the important fields (x and y coordinates of points, in particular) are separated by blanks.
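
Here is a minimal sketch of that kind of parsing in Python. The record tags ("SHAPE", "POINT") and the field layout in the sample are assumptions made up for illustration; check the actual file for the tags it really uses before relying on any of this.

```python
from collections import defaultdict


def parse_records(lines):
    """Group the blank-separated fields of each line by its first token."""
    records = defaultdict(list)
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        tag, fields = tokens[0], tokens[1:]
        records[tag].append(fields)
    return records


def points(records, tag="POINT"):
    """Convert the records under a (hypothetical) point tag to (x, y) floats."""
    return [(float(f[0]), float(f[1])) for f in records.get(tag, [])]


# Hypothetical sample lines; the real file may use different tags and fields.
sample = [
    "SHAPE 1 sidewalk",
    "POINT 10.5 20.25",
    "POINT 11.0 21.0",
    "",
]
recs = parse_records(sample)
print(points(recs))
```

To run it over the real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.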

Of course no one is obligated to do anything with this data at such a late date, but there it is if you want to take a shot at it. Later (probably during holiday break) I'll see about making a version of the extractor program that produces XML in a form close enough to the campus map XML input files to "fool" the data readers you have produced.

Monday, November 24, 2008

This week: No lecture, meet in Deschutes room 100

Just a reminder: This week we will not have lectures. I suggest you use our scheduled lecture time for a group "code party" in room 100 Deschutes, integrating and trouble-shooting and filling in anything that still needs doing. I'll be there.

Friday, November 21, 2008

Data, again

Many thanks to Daniel, who apparently knows the magical incantations necessary to transform ESRI shapefile data into XML files. If you look in
http://www.cs.uoregon.edu/classes/08F/cis422/data/OSBxml/
you will find the full collection as osb_gis_xml.tar.zip or you can look in the osb_gis_xml directory to get individual xml files.

It's very late, and I understand if you are not able to incorporate this into your project before the deadline. On the other hand, it would be great if at least some of you could do so, or at least give it a try and document what problems need to be solved to make it work.

Update: I've looked through the XML data and I'm not so sure we're making progress. As near as I can tell, what we have is "PolygonB" objects that are represented as byte arrays (and I don't know how to interpret those byte arrays). It seems that "PolygonN" objects have a field called "Rings", which are arrays of (x,y) coordinate pairs, and that's what we were expecting. I don't know how to turn "PolygonB" objects into "PolygonN" objects. Maybe someone does?

Wednesday, November 19, 2008

OSB data? Please have a look

I have an XML file from Amy. It's one big XML file, rather than a file per layer, but maybe that's not a problem (??). As I eyeball it, though, it seems to me like the shapes are defined in a different way than in the campus map XML files ... it looks like there is a sort of hex encoding of sequences of coordinates, instead of XML-ish encoding of each coordinate pair. So, my best guess is that this will not work with the "generalized" input modules that you have worked on. But please give it a try, and let me know what you discover. Is there some obvious way to transform this XML into something we can deal with?

I also have the raw ESRI shapefile data, if anyone knows the ArcGIS program well enough to manage the XML export from that. (I don't, but I am beginning to see that it's something I had better learn.)

The XML file is at: http://www.cs.uoregon.edu/classes/08F/cis422/data/OSBdata.xml

Midterm grading progress

I've made a first pass through all the midterms now, and assigned tentative scores ... but I want to make a second pass through to make sure your scores don't depend too much on how warm my coffee was or what time of day it was when I read yours. I will probably need to "bin" them (group into rough equivalence classes given the same score) to avoid too much random variation in scores.

Generally I'm pretty pleased. Most of the midterms did draw from both papers, and thoughtfully applied them to the specific project descriptions. A few of the midterms were particularly creative or insightful. Some were a bit generic, describing only general goals ("encourage open communication") rather than concrete plans for meeting those goals. Many, I thought, underestimated the difficulty of precisely defining interfaces and reaching a common understanding of what Jackson calls "ground terms". That's understandable, and I think it's hard to grasp how slippery terms and interface definitions can be until you've had some (possibly unpleasant) experience working with others across organizational boundaries.

Wednesday, November 5, 2008

Midterm

There is just one question on the midterm, but it asks you to draw from both of the papers we read and discussed. (You can also draw from your project experience, but be sure to draw from the papers as well.)

The midterm is due at 5pm Wednesday, November 12. Turn it in by sending your answer as plain text in the body of an email message, using this link:
Turnin email

Your answer should be 500 words or less. (The question is 331 words, so the upper limit is about 1.5 times as long as the question.)

Here is the question:

Development project teams may be distributed between organizations for a variety of reasons. One of the reasons is to combine domain expertise from different organizations participating in a collaborative project. Here are two hypothetical* examples:

a) Nike and Apple are jointly developing "smart shoe" systems for runners. Sensors in the Nike shoes will interact with iPod music players designed and manufactured by Apple. The adaptor between the shoes and the music players will be jointly designed and implemented by a team composed of a shoe sensor design sub-team at Nike headquarters in Portland; a web interface team at Apple's headquarters in Sunnyvale; and a data acquisition sub-team based in Apple's Portland offices. Each sub-team includes experts in one part of the system.

b) Electrical Geodesics, Inc. is a small Eugene company that produces dense-array electroencephalograph (dEEG) devices for measuring brain activity. These dEEG devices can be used together with functional magnetic resonance imaging (fMRI) devices to obtain more complete data about brain functioning. Combining and visualizing the large data sets produced by these devices requires advanced image-processing algorithms running on parallel computers. A project team to build a prototype dEEG+fMRI combined visualization system is composed of a dEEG expert from EGI, an expert in neuroanatomy and an expert in fMRI imaging from the UO psychology department, a computer science professor, and three graduate students, one from psychology and two from computer science.

Pick either one of these examples for your answer. Suppose you have been asked to help organize and manage that example project. How will you apply lessons from the papers we read? Please draw at least some from both papers. Don't just repeat advice from those papers --- explain how you would apply it to the specific example you chose.


* Parts of these examples are pure imagination, but both are (as they say in the movies) "inspired by a true story". See:
http://www.businessweek.com/technology/content/may2006/tc20060523_569911.htm
http://www.egi.com/research-division-converging-neurotechnologies
I have no idea about the actual composition and location of the development teams.

Wednesday, October 29, 2008

Reading: Jackson

The second paper I will ask you to read concerns requirements analysis and specification:

Jackson, M. 1995. The world and the machine. In Proceedings of the 17th international Conference on Software Engineering (Seattle, Washington, United States, April 24 - 28, 1995). ICSE '95. ACM, New York, NY, 283-292. DOI= http://doi.acm.org/10.1145/225014.225041

This is among the most lucid accounts I know of what it means to write a specification. The writing is light-hearted (the paragraph about steering mechanisms for cars busts me up laughing), but it's a very serious and deep consideration of the topic ... it's worth reading more than once, and thinking carefully about.

The paper accompanied an invited (keynote) talk at the International Conference on Software Engineering. Two years later, Jackson and Zave published a journal paper which encompasses some of the same material. If you like this paper, you might (at your option) follow on with:

Zave, P. and Jackson, M. 1997. Four dark corners of requirements engineering. ACM Trans. Softw. Eng. Methodol. 6, 1 (Jan. 1997), 1-30. DOI= http://doi.acm.org/10.1145/237432.237434

Tuesday, October 21, 2008

Some test data, but ...

Amy exported some XML from a City of Eugene geodatabase for me, and I've put it in an accessible place, but I'm not sure how useful it will be. It looks like the individual layer files are basically empty of actual shape data. The full XML dump of the database is quite large (53M), but I'm not sure if it contains data that corresponds to layer data in the campus map. If you want to give it a try, it's at
/home/faculty/michal/public_html/08F-GIS-Eugene
on the Sun (ix) file system, or
http://www.cs.uoregon.edu/~michal/08F-GIS-Eugene
through the web server.

The .shx files are in a native ESRI data format, and the .dbf files are some kind of native database file ... the Unix "file" command claims they are in DBase 3 data file format, which is possible. The "file" command claims the .mdb file is a Microsoft Access database, but that seems unlikely. If there is data useful for testing, it is likely to be in the file XMLExport_GeoDataBase.xml . Warning, that's 53 megabytes of XML text with really long lines. Whether it has anything usable or not, I really can't tell ... trying to look through it with Emacs didn't get me very far, and it crashed Firefox!

We're hoping for real OSB data at the end of the week.

Friday, October 17, 2008

Why Keith's parser doesn't produce an XML file

A few people have been confused about how Keith Albin's stylesheet parser fits with the rest of the system. In particular, there is a "Main" program in the parser directory, but that "Main" program is just a test harness to make sure the parser can accept the grammar of the style sheet. Unlike the older stylesheet parser (used by the Red and Blue teams), Keith's stylesheet parser does not produce an intermediate XML file representation of the style sheet.

So what does it do? If you call it from that Main class, it does basically nothing ... it builds some internal data structures and then throws them away. But take a look at StyleSheetReader.java in the actionscript directory --- this is where the parser is really called. It builds those data structures in the readStyles method, and then uses the data structures in the applyStyles method.

I don't know this code inside-out, but I think the basic logic goes something like this: We read the style sheet and create some internal tables with (keyword, value) pairs (which may actually be triples ... there is something called a "modifier" and I'm not certain what that is). Then we read all the map "shapes" from an XML file. Then, for each of those shapes, applyStyles looks in the tables and figures out what attributes should apply, and sticks those attributes into the shape object. Then we generate code (actionscript or a data structure for the Java display engine, as the case may be) using the shape information and all its style attributes.
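
To make that flow concrete, here is a small Python sketch of the read-then-apply pattern. It is not Keith's code: the function names, the dictionary representation of a shape, and the sample style entries are all hypothetical, and it ignores the "modifier" entirely.

```python
def read_styles(entries):
    """Build a table mapping a shape kind to its {attribute: value} pairs.

    `entries` is an iterable of (kind, attribute, value) triples, standing in
    for whatever the real parser extracts from the style sheet.
    """
    table = {}
    for kind, attribute, value in entries:
        table.setdefault(kind, {})[attribute] = value
    return table


def apply_styles(shapes, table):
    """Attach to each shape the attributes the table lists for its kind."""
    for shape in shapes:
        # Copy the attributes so shapes of the same kind do not share a dict.
        shape["style"] = dict(table.get(shape["kind"], {}))
    return shapes


# Hypothetical style entries and shapes, for illustration only.
table = read_styles([
    ("sidewalk", "sound", "tap.wav"),
    ("lawn", "sound", "rustle.wav"),
])
shapes = apply_styles([{"kind": "sidewalk"}, {"kind": "pond"}], table)
print(shapes)
```

A shape whose kind has no entry in the table simply ends up with an empty set of style attributes; code generation would then run over the styled shapes.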

OSB data new ETA - next week

I got word from Amy that there were some problems getting the OSB data into the geographic database. The problems have been worked out, but the new projected time to get the data is next week. That's awfully late in the (first half) project, and I worry that you won't have time to deal with whatever surprises turn up when you're working with a new data set (and I can almost guarantee there will be some surprises, though I don't know what they will be).

On the positive side, this is a realistic experience ... dealing with some external dependence that was not resolved as planned happens all the time.

So how do we cope with it? Here are my thoughts:
  • It's worth at least taking a shot at handling the OSB data when we get it ... but I won't be surprised if the result of that is identifying some problems rather than solving them. Identifying problems is progress.
  • I've asked Amy if we can get some different GIS data, possibly from downtown Eugene, so that we can have some kind of test of the ways you are generalizing the input handling.
  • If all we can do is test the generalized input processing on the existing input data ... it's not what I hoped for by the midterm, but it's better than nothing.
So we go forward as best we can.

Monday, October 13, 2008

Recommended article on checklists

I highly recommend the New Yorker article "The Checklist" by Atul Gawande. (I considered requiring it, but decided for this term just to recommend it.)

Why would I ask you to read a paper about use of checklists by doctors and nurses in hospitals? Because the basic principles here (and even the practice of using checklists) apply also to software development. Creative, problem-solving tasks take a lot of our mental energy. Organizing the routine work and having well-defined steps that we follow in a set order helps us to avoid simple mental slips, and actually helps us focus better on the not-so-routine aspects of work. Even so, professionals often chafe at checklists and other ways of organizing their routines.

Returning to my mantra that everything in software development is design: The problem being addressed here is that people are actually very bad at minding a lot of details. In fact, when we're "on autopilot" we tend not to even remember what parts of our routine we have and have not completed. It's both a strength and a limitation of the way our minds work, and we have to design our work methods accordingly. Making some things even more routine, with defined methods and even checklists, helps us make sure we're doing the routine stuff right, and makes sure at least a little attention passes over each of the things we ought to be paying attention to. Supporting the routinization of those parts of the work relieves some distraction from the hard, creative parts we want and need to focus on.

A common use of checklists is in code and design reviews. They can serve as a feedback mechanism, across as well as within projects: If we keep good records of problems that are found in a project (bugs, poorly defined requirements, whatever), we can mine those records to identify the points at which they might have been prevented or detected earlier and more cheaply. We put those on checklists, and work through the checklists at the appropriate points in future projects. These can be simple code organization rules like "Keep configuration constants together in a separate source file" or "use only relative file paths except in a single top-level configuration file". They can also apply in requirements elicitation, specification, and early design stages. For example, a good routine for checking that a requirement is well-defined is to require one or more test cases to be defined as part of the requirement.

Tuesday, October 7, 2008

Reading: Herbsleb & Grinter

Please read this week:

Herbsleb, J. D. and Grinter, R. E. 1999. Splitting the organization and integrating the code: Conway's law revisited. In Proceedings of the 21st international Conference on Software Engineering (Los Angeles, California, United States, May 16 - 22, 1999). International Conference on Software Engineering. IEEE Computer Society Press, Los Alamitos, CA, 85-95.

There are a couple reasons for reading this paper. For one, I want you to understand Conway's law (the relation between the structure of a software system and the structure of the organization that creates it), and in general the way technical issues in software design are tangled up with social and management issues. In addition, as software development is increasingly global, the cultural and communication issues discussed in this paper will become more and more important.

You should be able to download the paper from the ACM server, using the link above, when you are connected through the University. Let me know if you have problems with that.

What are my exams like?

Daniel asked the question, but maybe others would also like to know ...

My midterm from last fall is at http://www.cs.uoregon.edu/classes/07F/cis422/midterm.html

Here's the final exam from that term (which was a sit-down exam, while the midterm was a take-home exam):

Final exam, CIS 422, Fall 2007

You may write (legibly) by hand, or you may use your laptop computer and turn in a plain text file by USB. Text files should be named JohnDoe.txt (but use your own name), and your name should also be the first thing I see inside the file. I need your name if you use paper, too.


1. We sometimes replace recall tasks by recognition tasks to reduce memory load. Considering this goal, very briefly describe why each of the following is or is not a good application of replacing a recall task with a recognition task in an online shopping application that is used only occasionally (say, once a month) by each user.
1a) Instead of typing a date, pop a calendar chooser.
1b) Instead of typing the user's state (e.g., OR for Oregon) in the delivery address, pop a menu of states.
1c) Instead of choosing a brown sweater by clicking on its picture (in a row showing brown, blue, and red sweaters), pop a menu with color choices.
1d) Instead of typing "sleeveless turtleneck" in a search interface, check the "sleeveless" and "turtleneck" boxes in an array of style choices.

2. "Visibility" in a software development process means being able to assess how we are doing at each point along the way. The general concept of visibility applies both to schedule (are we on schedule? how far behind or ahead?) and to qualities like maintainability, usability, and dependability.

Visibility is a major challenge in project planning. Briefly describe:
• A visibility issue faced by your project team (whether or not you successfully addressed it)
• One or two actions your team took to address that challenge, AND/OR
• One or two actions you wish your team had taken to address that challenge
To the extent possible, describe how the actions your team took or might have taken relate to processes we have discussed in class.

Saturday, October 4, 2008

Field trip?

Bob Disher, technology specialist at Oregon School for the Blind, asked if we would like to visit as a group. I think that could be an excellent thing to do, but I have a lot of questions about how and when to do so, and of course I will need to know how interested CIS 422/522 students are in such a field trip. And of course there are the practical issues of transportation and paying for it (e.g., if we rent one or more vans for a field trip).

Regarding scheduling, I can think of three plausible times to go:
  • As soon as possible
  • Mid-term, when we have initial prototypes to show off, so we can solicit feedback and ideas
  • At or near the end of the term, when we have more to show off (but feedback and ideas are more for future developers)

My initial thought is that (a) trying to go right away is probably hopeless ... the earliest we could organize it is mid term and (b) a term is so short that we would probably get most out of it at the end, perhaps Monday or Wednesday of dead week. Those are just initial thoughts, though ... let's talk about it a bit in class Monday.

Quantitative and ordinal attributes

In discussion of what the OSB map should include, two quantitative attributes were mentioned:
  • Incline (e.g., slope of a path or lawn)
  • Width of a sidewalk
Incline is the more interesting of these, because unlike the width of a sidewalk (which would be at least represented by width of an area in the map display), we currently don't have a way to represent incline at all. Note that it is distinct from, and perhaps orthogonal to, the kind of object: a sidewalk can have an incline, and a lawn can also have an incline. Moreover, incline is relative and directional: We care about grade, not altitude, and the same location can be inclined upward, downward, or level depending on which direction one is walking.

In tactile maps, there are symbols indicating incline.

The current soundscape map system has no provision for such attributes, nor is there a way to describe them in the style sheet language. My first thought is that we probably need ways of modulating existing sound cues (e.g., altering pitch) in a way that maintains the meaning of the cue ("this is a sidewalk") and conveys additional information. It might be ok to discretize quantitative attributes, for example representing incline only by one of the discrete categories "steep down", "down", "level", "up", "steep up".
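
As a rough illustration of discretizing incline, here is a Python sketch. The threshold values (2% and 8% grade) are placeholders I made up, not recommendations, and nothing like this exists in the current system. Grade is taken as rise over run along the direction of travel, so walking the other way just flips the sign.

```python
def grade(rise, run):
    """Signed grade along the direction of travel (rise over run)."""
    return rise / run


def incline_category(g, level=0.02, steep=0.08):
    """Map a signed grade to one of five discrete categories.

    The `level` and `steep` thresholds are hypothetical placeholders.
    """
    if g <= -steep:
        return "steep down"
    if g < -level:
        return "down"
    if g <= level:
        return "level"
    if g < steep:
        return "up"
    return "steep up"


# A path that rises 1 unit over 20 units of travel, walked in each direction.
g = grade(1.0, 20.0)
print(incline_category(g))
print(incline_category(-g))
```

Each category could then select a modulation of the existing cue (a pitch shift, say) rather than a new sound, so the cue still means "this is a sidewalk".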

We don't absolutely have to come up with ways of handling such attributes this term, but it's on the "wish list" for things that you may consider if it fits in your development schedule. As I said in class Wednesday, your project should be structured as a sequence of increments, so that the question is not when you are done but rather what you get done. (This is a very common approach to system design and project planning when deadlines are inflexible.)

Notes from OSB visit, 3 Oct 2008

I accompanied Prof. Amy Lobben to Oregon School for the Blind (OSB) in Salem Friday, along with a geography student who is gathering raw data for the OSB map using a GPS unit. We met primarily with Bob Disher, Technology Specialist at OSB. Here are a few notes on that visit and what I learned.

The OSB campus is smaller than University of Oregon campus, and the map will be quite a bit simpler, with fewer layers (classes of object). Objects to be included in the map include at least the following:
  • Buildings
  • Some rooms within buildings
    • Auditorium, media center, main office, infirmary, dining hall
  • Some main doors to outside
  • Parking lot
  • Sidewalks
  • Campfire pit (a meeting place)
  • Sensory garden (as a single object)
  • Major features of the immediately surrounding area
    • Cross streets
    • Intersections
An interesting issue that came up, and one that has not been considered in our soundscape map work to date, is representing attributes like incline (slope) and the width of sidewalks. I'll make a separate post about that.

In discussions about longer term trajectory for the project, Bob asked about producing maps from data that is not in a geographic information system, e.g., from a plan of a building that is in terms of relative positions but not referenced to latitude and longitude. This seems conceptually straightforward (it really shouldn't matter to us whether the coordinate system is degrees or feet, or where the origin is), but it does suggest that our earlier discussion about not tying our input data format too tightly to a fixed GIS data source is in fact relevant.
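
One way to make the display indifferent to units and origin is to normalize every input to the same box before doing anything else with it. The Python sketch below is only an illustration of that idea under my own assumptions (the function name and sample coordinates are invented), not a description of what any of the existing input modules do.

```python
def normalize(points):
    """Shift and uniformly scale (x, y) points into the unit square.

    The same scale factor is used on both axes so shapes keep their
    proportions; the larger of the two extents maps to length 1.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # avoid dividing by 0
    return [((x - min_x) / span, (y - min_y) / span) for x, y in points]


# The same outline in two made-up coordinate systems (different origin, units).
plan_in_feet = [(0.0, 0.0), (100.0, 50.0)]
plan_shifted = [(1000.0, 2000.0), (1200.0, 2100.0)]
print(normalize(plan_in_feet))
print(normalize(plan_shifted))
```

One caveat: raw latitude and longitude are not equally scaled in the two directions away from the equator, so data in degrees would need a projection step before a uniform scaling like this preserves shapes.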

In a separate discussion with Amy, on the way to Salem, I asked whether we should be considering GIS systems other than those of ESRI (the company that makes ArcGIS). Amy said that, while GIS systems sometimes do provide some limited interoperability by reading other data formats, in practice ESRI dominates the field so strongly that supporting ESRI formats is enough for handling pretty much all GIS data we are likely to encounter.

The meeting was quite loosely structured, and in addition to needs for the current project, we discussed the way the project has evolved and the way it involves students. Bob Disher extended an invitation for us to visit OSB as a group. I'll write a separate post to discuss that possibility.

Tuesday, September 30, 2008

Prior implementations available

The project description page (here) now has links to the three prior implementations we will use as starting points.

The Red Team and Blue Team implementations are both Java implementations. One of them was a bit more ambitious in using a quadtree data structure for hit detection. The other uses a simpler hit-detection method, but performed adequately anyway. You will need to choose between these as a starting point for the Java display code.
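
For a sense of what the simpler approach to hit detection can look like, here is a Python sketch that scans every shape and applies a standard ray-casting point-in-polygon test. This is not taken from either team's code (which is in Java), and the shape names and coordinates are invented. A quadtree would replace the linear scan with a lookup that only examines shapes near the query point.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def hit_test(x, y, shapes):
    """Return the name of the first shape containing (x, y), or None."""
    for name, polygon in shapes:
        if point_in_polygon(x, y, polygon):
            return name
    return None


# Hypothetical shapes: (name, list of polygon vertices).
shapes = [
    ("lawn", [(0, 0), (4, 0), (4, 4), (0, 4)]),
    ("building", [(10, 10), (14, 10), (14, 14), (10, 14)]),
]
print(hit_test(2, 2, shapes))
print(hit_test(5, 2, shapes))
```

With a map of modest size, a scan like this may well be fast enough, which would be consistent with the simpler implementation performing adequately.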

Keith Albin's thesis project produces a map in Adobe Flash. It has a richer, more convenient style-sheet language, which is described in his thesis (file dsl.pdf) in the "docs" directory. He refactored and reworked some of the code for processing input data, and I'm pretty sure you will want to use his code rather than either the Red or Blue team code as a starting point for the front end.

Monday, September 29, 2008

Team assignments are out

I just assigned teams and sent email with team assignments (5 groups of 5 students each). If you did not receive an email message telling you who your teammates are, something went wrong ... drop me a note.

I have labeled all of these assignments as “tentative” for the moment, just in case there is some big problem I didn't notice or anticipate, but otherwise I expect these will be your permanent team assignments.

Something I forgot to mention is the possibility of trades. A trade is when one group says "you can have Joe and Sally and we'll take Ebineezer and Bertha". The catch is, both groups and the people to move between groups have to agree (e.g., it doesn't work if Ebineezer doesn't want to change groups).

A few students were interested in a game project for the second half of the term. This might involve a bit of moving people around at that time. I tried to group people so that the disruption will be minimal.

Let's try a blog for course announcements this term

When things get busy, I typically have trouble getting announcements up on the course web site quickly. Using a blog for news and announcements might help by cutting a couple steps out of the process, and the option of posting comments (questions, etc.) might also come in handy. We'll try it, and we can always go back to the old way if it doesn't work well.