Transcript

INTRODUCTION

Robert Hess, Group Manager for Microsoft Corporation and Show Host.

ROBERT HESS: Welcome to another episode of the .NET Show. On this episode we're gonna cover WinFS. Now, in the previous episode we talked about Indigo, which is an aspect of Longhorn that maybe might be a little esoteric. But WinFS deals with the file system, but this file system isn't quite the same as you're used to. We'll be discovering a lot of new capabilities and new features that WinFS is going to bring to the programming environment. But first, let's check in with Erica and the news.

MSDN NEWS UPDATE

Hosted by Erica Wiechers, Program Manager.

ERICA WIECHERS: Hello, I'm Erica Wiechers. Welcome to the MSDN News Update.

Security Guidance Center In February, Microsoft launched a new Web site dedicated to providing guidance, tools, training and updates to assist developers and IT pros in planning and managing a security strategy for their organization. The new site also provides information on key security topics, How-to articles, and a place to sign up for security notifications.

Bill Gates' Vision to Stop Spam Also in February, in his keynote address at the RSA Conference 2004, Bill Gates announced a detailed vision and proposals on how technology can be used to help put an end to spam, including outlining the company's Coordinated Spam Reduction Initiative (CSRI) and technical specifications for the establishment of Caller ID for E-mail. The Caller ID for E-mail proposal aims to eliminate domain spoofing and increase the effectiveness of spam filters by verifying what domain a message came from.

BizTalk Server 2004 Release On March 2, Microsoft unveiled BizTalk Server 2004 which is a member of the Windows Server System. BizTalk Server 2004 helps increase the productivity of information workers, IT professionals and developers with role-specific tools for developing, managing and accessing business processes in familiar environments such as the Microsoft Office System and Visual Studio .NET 2003.

New TechNet Web site Also in March, the Microsoft TechNet Web site re-launched with a new design. Some of the major changes include a scoped search, retirement of the deep-tree navigation system, interactive page tools, and the removal or archiving of older content. Check out the new site at microsoft.com/technet.

And this has been the MSDN News Update. I'm Erica Wiechers.

For more information on the above news items please refer to the following:

Security Guidance Center: Check out the new Security Guidance Center: http://www.microsoft.com/security/guidance/default.mspx

Bill Gates’ Vision to Stop Spam: Read the press release

BizTalk Server 2004 Release: Read the press release; BizTalk Server Web site: http://www.microsoft.com/biztalk

New TechNet Web site: Check out the newly designed site: http://www.microsoft.com/technet

TECHNOBABBLE

Robert Hess meets with Anil Nori, Lead Architect for WinFS, and Quentin Clark, Director of Program Management for WinFS.

ROBERT HESS: Welcome back. Like I said, today we're gonna take and focus on WinFS. WinFS is the file system that'll be coming out in Longhorn. There's an awful lot of changes involved that programmers and application developers need to understand in order to properly take advantage of that. To understand what those issues are, I have with me Anil Nori and Quentin Clark. Thanks for joining me.

QUENTIN CLARK: Thanks, Robert.

ANIL NORI: Thanks.

ROBERT HESS: In the WinFS scheme of things, what exactly are you guys responsible for?

ANIL NORI: I'm the Architect for WinFS. I look at the overall vision and overall strategy, both in the near term and the long term. I also focus a lot on some of the technical details, how the system is put together.

QUENTIN CLARK: I'm the Director of Program Management, so I run the team that is involved with all the design specification and customer management aspects of the Product Team.

ROBERT HESS: Now, WinFS is, like I said, the file system for Longhorn. It's actually a lot more than just a file system. What is the in-a-nutshell commentary about what you think WinFS is and why people should be paying attention to it?

QUENTIN CLARK: WinFS actually came about because of a lot of different needs. It wasn't a simple matter of trying to take the file system and make a couple of simple advancements. A few different factors played in. One of them was, end users were starting to struggle with searching, and this wasn't solvable in a straightforward way. There are different utilities out there today that you can use that plug into applications like Outlook and plug into the file system that's in Windows and allow you to start doing searches across these things. But if you have a variety of different applications, there's a never-ending series of utilities you'd have to create that are able to understand all those different data sources. So increasing the productivity for both consumers and knowledge workers in IT spaces was one of the motivations behind WinFS. Is there a way we can structure a lot of this data so people can find things a lot more easily? That was one. Another is, a lot of the development community's coming to us and asking us to expose new primitives in Windows. They make comments like, Why can't I get an address for shipping from Windows? Somewhere the user has input this information and right now it's in all these different applications, whether it's ACT, Eudora, or Outlook, or Notes, or wherever, or the Windows Address Book itself; why can't there be a Windows way of doing that that everyone can use that I can then rely on that as a developer, where my thing isn't about contacts; my thing's about shipping or it's about buying things, or whatever, and I need to be able to interact with contacts, not manage them.

ROBERT HESS: I mean like Outlook has an API you can use to call in and get the contacts out of it.

QUENTIN CLARK: Sure, but any application developer then needs to understand the 15 different popular PIM management solutions and the APIs there, and they're all very, very different. So providing a common platform for a set of things, contacts, calendar items, user tasks, media types, etc., gives a level of power and new capabilities for all developers for Windows.

ANIL NORI: Yeah. I think the underlying theme here is really sharing also, because it facilitates sharing, because that's the fundamental shift where today every app has the data locked into their silos, and we want to open it up so that, I just wanted to find one contact and it is schematized so that every app can understand it. Not only Microsoft apps, but also ISV apps can leverage a lot of the PIMs and the PIM data that is actually in the platform.

ROBERT HESS: Wasn't that kind of the notion behind extended MAPI or something like that, having a format that applications would support and expose information, or even like the OLE structured storage and stuff like that? Why is a file system even coming in and playing that role?

QUENTIN CLARK: Well, I think we'll get to there in a second, but let's finish the first topic of what happened? Why was WinFS invented in the first place? Because the third topic of that really is about developers telling us they wanted a stronger data story inside Windows for their own specific types. Not only do they want to be able to integrate with something like a contact or a calendar item, but they want a rich, relational storage for their own data types, wherever they happen to be. Most applications use data, and managed data is the whole point of the application. So they've been asking us for a richer database platform inside Windows. Getting back to your point, how does this all kind of cook down? We looked at all these needs, talked to a lot of developers, looked at a lot of systems that are out there and decided to develop a WinFS data model to solve all these needs at once. Back to your point about, we kind of attempted this sort of thing before and you could imagine OLE Doc properties being a place where you could put extended data in and you could develop common file formats for something like a contact. The problem is, there hasn't been a rich programming model around these things that's really extensible in all the right ways. In order for a contact to be both a Windows thing and something that Outlook can use, as well as something that Notes can use, means you have to allow them to extend those things as well as plug in at the programmatic interfaces so their own unique application logic can be part of what it means to be that particular instance of one of those data types.

ANIL NORI: Yeah. I think one thing that I would add to that is, adding a strong type system and then adding a schema to it, yeah, you can have OLE Doc properties, but if it is actually a document I want to see certain properties; if it is a picture then I want to see different properties; if it is a contact I want to see different properties. So we want to define a richer schema also that enables, again, application integration.

ROBERT HESS: Making more of a platform then, whereas like OLE structured storage wasn't really a platform; it was more like a technology people could use.

QUENTIN CLARK: Yeah. The other sort of angle when you look at this is, when you start to have schemas, what else can you get out of that? Suddenly I can ask the system questions like, Is there a way for me to take a document and a contact and create an authoring relationship between these, because now I have all these strong schematized notions of what those things are? Can I start to look at things through relationships, whether they're explicit things - why are two things together, whether or not they're query-based things, like find me all the pieces of mail from Robert in the past week with PowerPoints in them, or with PowerPoints that are about WinFS in them. You start to be able to answer those questions because of that schema, and that raises the level of power that developers now have at their disposal quite a bit.
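
The kind of relationship-based question Quentin describes can be sketched in a few lines of Python. This is an illustration only, not the WinFS API: the `Item` and `Relationship` classes, the relationship names "sender" and "attachment", the sample data, and the fixed date are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Item:
    item_id: int
    kind: str                      # e.g. "contact", "mail", "presentation"
    props: dict = field(default_factory=dict)

@dataclass(frozen=True)
class Relationship:
    name: str                      # hypothetical names: "sender", "attachment"
    source: int                    # item_id of the item that owns the link
    target: int                    # item_id of the related item

def related(items, rels, source_id, name):
    """Return the items reachable from source_id over relationships called name."""
    return [items[r.target] for r in rels if r.source == source_id and r.name == name]

def mail_from_with_attachment(items, rels, sender_name, since, attachment_kind):
    """Mail from sender_name received on or after `since` with an attachment of the given kind."""
    hits = []
    for item in items.values():
        if item.kind != "mail" or item.props["received"] < since:
            continue
        senders = related(items, rels, item.item_id, "sender")
        attachments = related(items, rels, item.item_id, "attachment")
        if any(s.props["name"] == sender_name for s in senders) and \
           any(a.kind == attachment_kind for a in attachments):
            hits.append(item)
    return hits

today = date(2004, 3, 15)          # arbitrary fixed date so the example is repeatable
items = {i.item_id: i for i in [
    Item(1, "contact", {"name": "Robert"}),
    Item(2, "presentation", {"title": "WinFS overview"}),
    Item(3, "mail", {"subject": "Slides", "received": today - timedelta(days=2)}),
    Item(4, "mail", {"subject": "Lunch?", "received": today - timedelta(days=1)}),
    Item(5, "mail", {"subject": "Old slides", "received": today - timedelta(days=30)}),
]}
rels = [
    Relationship("sender", 3, 1), Relationship("attachment", 3, 2),
    Relationship("sender", 4, 1),
    Relationship("sender", 5, 1), Relationship("attachment", 5, 2),
]

recent = mail_from_with_attachment(
    items, rels, "Robert", today - timedelta(days=7), "presentation")
print([m.props["subject"] for m in recent])
```

Only the message that satisfies all three conditions (sender, date, attachment type) is returned. The point of the sketch is that once items have types and named relationships, such a question becomes a simple filter rather than a per-application parsing problem.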

ROBERT HESS: Now, if we compare WinFS to the current NTFS, current file system sorts of things, today we've got a file system that supports folder structure, hierarchal structured storage. It's got some metadata associated to it, whether it's the date, the time, the file name, the extension, the file size and stuff like that. Is WinFS simply evolving that, extending upon it? Is it changing it dramatically, and if so, how are we going to deal with those changes in our programs and in just the way we use the computer?

ANIL NORI: I think WinFS will support, first of all, we will support the backward compatibility. Suppose if you have Win32 applications, they will continue to run. From that perspective, wherever you have files, you can actually put it into the WinFS namespace and they all will work. But we go way beyond that in a sense that not just certain creation dates and those properties; you also want maybe sometimes properties based on the content of the document, or content of the file itself. We have mechanisms where you can pull the metadata, we can pull actually certain relevant data and then make them as structured properties, and again, adhering to certain schema so that you can then write rich queries. As Quentin pointed out, I want to be able to say, Find me all the documents that are authored by Anil Nori, and who hierarchically lives in a certain ZIP code. You can actually write much more complex queries.

ROBERT HESS: It's kind of like what we have right now, NTFS has a file schema associated to it, which is very limited; it's not extendible, and it doesn't search very easily and doesn't have an awful lot of information. You can search on file name and file date and maybe file size and stuff like that, and you're saying that these new schemas will just allow schemas to be not just file schemas, but document schemas and calendar schemas and contact schemas, and they'll be extensible, and the searching model used on that will be one that's highly tuned to make it even faster to search all that extended data than we currently have with the standard NTFS type of search.

QUENTIN CLARK: That's right. I think a big part of WinFS, which it'd be great if you'd talk about a little bit is, we have the ability to take care of not just the unstructured data that file streams are today. Let's face it, something like a JPEG, the heart of that is a series of ones and zeros you feed to a decompression algorithm, and we're never gonna structure that thing inside.

ROBERT HESS: Find me something with purple pixels in it.

QUENTIN CLARK: Yeah. You could do color histograms and that is structured and you could go query that, but the actual stream of ones and zeros that digitizes the picture of us in this show is not something you can further structure very easily. There will always be things that are always gonna be the stream kinds of data, unstructured data. But WinFS also allows you to have structured data as well as semistructured data. Being able to manage all these kinds of data and search across them is one of the key design aspects.

ANIL NORI: Yeah. And actually bringing them together. Just to give an example, let's say I have a person or contact and then I want to put in addition to just the contact structured properties, I want to put the picture of the person, and then I also want to put, let's say, my resume added to that. Resume could be semistructured, because it's not fully just zeros and ones, there is some structure to it, experience and education, background. So all of those things are some structure there. I should be able to come and query that structure then say, Find me all the people where in the experience column they worked on some Windows experience, and then maybe they have some Linux experience also, and then I also may want to do something on the unstructured property, like in the picture find if this color is there or if somebody was wearing glasses or things like that. You can query across structured, semistructured as well as unstructured data in the same query.
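
Anil's example of one query spanning structured, semistructured, and unstructured data can be illustrated roughly as follows. Everything here is hypothetical: the people, the dictionary layout, and the `color_histogram` function, which stands in for a metadata handler that derives a queryable property from raw picture data. A list of color names is used in place of real image bytes.

```python
from collections import Counter

def color_histogram(pixels):
    """Hypothetical metadata handler: reduce raw pixels to queryable color counts."""
    return Counter(pixels)

people = [
    {
        "name": "Ann",                                   # structured property
        "resume": {"experience": ["Windows", "Linux"]},  # semistructured document
        "picture": ["red", "red", "blue"],               # stand-in for raw image data
    },
    {
        "name": "Bob",
        "resume": {"experience": ["Windows"]},
        "picture": ["green", "green"],
    },
    {
        "name": "Cy",
        "resume": {},                                    # no experience section at all
        "picture": ["red"],
    },
]

# Promote a property derived from the unstructured stream so queries can use it.
for person in people:
    person["colors"] = color_histogram(person["picture"])

def matches(person, skills, color):
    """True if the resume lists every skill and the picture contains the color."""
    experience = person["resume"].get("experience", [])
    return all(s in experience for s in skills) and person["colors"][color] > 0

found = [p["name"] for p in people if matches(p, ["Windows", "Linux"], "red")]
print(found)
```

The raw picture itself is never searched. The query only touches the histogram that the handler extracted from it, which is the pattern the discussion describes for unstructured streams.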

ROBERT HESS: How fast is that though, and how are we taking and dealing with the speed issue? Right now, on my system if I want to take and search for all files that even have this in the name of the file, and I know that's a very limited set of information there, it still takes almost forever sometimes it seems like.

ANIL NORI: Yeah. This is where I think we fundamentally, now we are coming to the technology that we are using underneath the relational technology. It's pretty mature that we have shown that you can actually run large number of transactions and so we have really clever mechanisms for indexing as well as for querying. We are leveraging that underlying relational store to do all these things very efficiently.

QUENTIN CLARK: I think a big architectural difference between what we have today in the file system and the way WinFS is being built is, WinFS as a file system, as you are making changes to data, we know how to maintain this query space, these indices, etc., so we're not chasing anything around. We're not chasing around the truth; we're managing the truth. As you make changes we know what those changes are, we know how to change indices so searching back on those is effective. These metadata handlers also can be invoked in order to help ensure that changes to stream data get reflected up in the structured space correctly. We've architected this whole thing in mind of, we will manage what is, in fact, the truth in the system as changes occur, we know how to update indexes, etc., so that when queries are made we can respond to those very, very efficiently.
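
The difference between "chasing the truth" and "managing the truth" can be shown with a toy store that updates its index as part of every write. This is a minimal sketch, not how WinFS or any particular database implements indexing; the class name `IndexedStore` and its methods are invented for illustration.

```python
from collections import defaultdict

class IndexedStore:
    """Toy store that keeps a per-property index current on every write."""

    def __init__(self, indexed_props):
        self.items = {}                              # item_id -> property dict
        self.indexed_props = set(indexed_props)
        self.index = defaultdict(set)                # (prop, value) -> item_ids

    def put(self, item_id, props):
        # Remove stale index entries for the old version before storing the new one.
        old = self.items.get(item_id, {})
        for prop in self.indexed_props:
            if prop in old:
                self.index[(prop, old[prop])].discard(item_id)
        self.items[item_id] = dict(props)
        for prop in self.indexed_props:
            if prop in props:
                self.index[(prop, props[prop])].add(item_id)

    def find(self, prop, value):
        if prop in self.indexed_props:
            return set(self.index.get((prop, value), set()))   # direct index lookup
        # Fallback for properties nobody indexed: scan every item.
        return {i for i, p in self.items.items() if p.get(prop) == value}

store = IndexedStore(indexed_props=["sender"])
store.put(1, {"sender": "Robert", "subject": "Slides"})
store.put(2, {"sender": "Erica", "subject": "News"})
store.put(1, {"sender": "Erica", "subject": "Slides"})   # the update moves item 1 in the index

print(store.find("sender", "Robert"))
print(store.find("sender", "Erica"))
print(store.find("subject", "News"))                     # unindexed, so this one scans
```

Because the store sees every change, the index is never out of date and a search on an indexed property does not have to walk the data. Searching an unindexed property still works, but falls back to a scan.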

ROBERT HESS: We're not doing a brute force search. We're not just taking and saying, Okay, the guy asked for a search, I'm gonna start from scratch and just start searching my hard drive, like we do today. Which is probably fine on a floppy disk; if you're doing a brute force search it's great.

QUENTIN CLARK: Yeah. The way that relational technology in general works is, you build indices for things you're gonna be searching for on a regular basis. There's always the ability to go into individual pieces of data to find things in sort of a more brute force fashion, if you will, for things that are not indexed, but we're tuning the user experience and what we expect people to do with the data to ensure that search for common things is very fast. Things like, Show me all the mail from someone that I've gotten this week, is something we know people are gonna ask for, we know we can index that and make sure that that responds very quickly. Further, part of what we're trying to figure out how to do is, how do we make sure that things we're not expecting to be searched that end up being searched pretty frequently, how do we develop a response to that, so that we can make those searches much more quickly?

ANIL NORI: Yeah, and also when you talk about some of these performance issues, people think that, Oh, you're using a database, that means it's gonna be heavyweight, but the relational systems and databases are really optimized how to use the IO very well. Even when you really do lots of writes, we actually have lazy way of writing things to the disk, but still not lose your transactions. So this notion of a log, so you write to the log and then push all the data in later on. You spread your workload around, so you amortize it rather than you suddenly choke, and then stop, and then you have to write it; you don't have to do that. Databases have really done a very good job of doing that, and that's what we're leveraging.
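
The logging idea Anil mentions, where a commit writes to a log first and the data is pushed to disk later, can be sketched as below. This is a simplified model under stated assumptions: plain Python lists and dictionaries stand in for the on-disk log and data pages, and the class `LoggedStore` is hypothetical rather than taken from any real database.

```python
class LoggedStore:
    """Toy write-ahead log: commit appends to a log; data pages are written later."""

    def __init__(self):
        self.log = []        # stand-in for the sequential on-disk log
        self.disk = {}       # stand-in for the on-disk data pages
        self.cache = {}      # in-memory state, lost on a crash
        self.flushed = 0     # number of log records already applied to disk

    def commit(self, key, value):
        self.log.append((key, value))   # cheap sequential write, done first
        self.cache[key] = value         # the data page itself stays dirty in memory

    def flush(self, max_records):
        """Lazily apply up to max_records pending log records to the data pages."""
        end = min(len(self.log), self.flushed + max_records)
        for key, value in self.log[self.flushed:end]:
            self.disk[key] = value
        self.flushed = end

    def crash_and_recover(self):
        """Drop memory, then rebuild state from the data pages plus the log tail."""
        self.cache = dict(self.disk)
        for key, value in self.log[self.flushed:]:
            self.cache[key] = value

store = LoggedStore()
store.commit("a", 1)
store.commit("b", 2)
store.commit("a", 3)
store.flush(max_records=1)     # only the first record has reached the data pages
store.crash_and_recover()      # the remaining two are replayed from the log
print(store.disk, store.cache)
```

Flushing in small batches is what spreads the write workload out over time, and replaying the unflushed tail of the log after a crash is why committed changes are not lost even though the data pages were behind.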

ROBERT HESS: A lot of it is just trying to be as intelligent as possible and making sure that it's done properly using the technology that's advanced for either databases or even Web searches and stuff like that.

QUENTIN CLARK: Absolutely. I think a big factor in WinFS is that we are building on mature relational technology -- and relational technology, you've been around since the beginning, you've been working on relational technology for decades now - and on NTFS. We're not throwing these things away, but we're figuring how to use them, marry them together in order to get the set of capabilities we want.

ANIL NORI: That's a good point. When you do put an unstructured, let's say I have my picture I am throwing in my document as a file, we're not forcing to push that completely into the database. We're integrating the NTFS as file streams with the database. Your existing read/write operations would work at the same speed that what you're used to through Win32. But still you get the transactional integrity across your structured data, as well as the unstructured data. If I make a change to my contacts information, as well as I change something in the document, some of it goes into the NTFS, the file stream, some of it goes to the database, but we guarantee you that it is atomic and you don't lose any of that information. In some sense you're leveraging the best of the both, both the database stuff as well as the file stuff. Take the good stuff from both and then integrate that. This is the reason that I believe it is possible, and as Quentin pointed out that, well, in the previous times people have tried to do this, our ways are different, but I think here we are using the technology, you are putting it to use and then saying, use what database is good for structured stuff, use a file system for unstructured stuff and then bring it together so that you get a seamless view of how you access your data. And then you can search across both kinds of data.
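
The all-or-nothing behavior across structured properties and file streams can be illustrated with a small sketch. It only models what the caller observes, namely that either both stores change or neither does. It does not show the logging and recovery a real system would need to guarantee that across a crash. The `atomic` helper and the sample keys are assumptions for the example.

```python
from contextlib import contextmanager
import copy

@contextmanager
def atomic(properties, streams):
    """Yield working copies; publish both on success, neither on failure."""
    work_props, work_streams = copy.deepcopy(properties), dict(streams)
    yield work_props, work_streams
    # Reached only if the with-body raised nothing.
    properties.clear()
    properties.update(work_props)
    streams.clear()
    streams.update(work_streams)

properties = {"contact:1": {"phone": "555-0100"}}   # stand-in for the database side
streams = {"doc:1": b"draft"}                       # stand-in for NTFS file streams

# A successful update touches both stores.
with atomic(properties, streams) as (p, s):
    p["contact:1"]["phone"] = "555-0199"
    s["doc:1"] = b"final"

# A failed update leaves both stores exactly as they were.
try:
    with atomic(properties, streams) as (p, s):
        p["contact:1"]["phone"] = "555-0000"
        raise OSError("stream write failed")        # simulated failure mid-update
except OSError:
    pass

print(properties, streams)
```

After the failed attempt the phone number is still the value from the first, successful update, and the document stream is unchanged.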

ROBERT HESS: Now, you mentioned a few times about having the schematized data with contacts and the e-mail and documents and stuff like that. I assume that we're coming up with some schemas that we feel are robust enough to identify some of those key areas. What are some of the opportunities that ISVs have for dealing with some of that schematized data, either extending it themselves, like some application might be doing something interesting with contacts that we didn't quite take into account when we created the schema for it; another application might say, well gee, I'm needing to do a schema for restaurants, and you don't have a restaurant schema.

QUENTIN CLARK: Yeah.

ANIL NORI: Yeah.

QUENTIN CLARK: And I don't think that will be one of the foremost Windows schemas, the restaurant schema. But you take that specific example. A restaurant management app, say, and they're gonna need a variety of different things. They're gonna want to extend contacts in particular ways to become restaurant patrons, I guess is the right word. They have the ability to add data, to add new properties, to extend contacts in that particular way. They also have the ability to add methods to that as well, so if they want to add the method of, I was just visited by this contact, this contact was here today so I want to set up a bunch of stuff about that, they can add new code to what it means to be a contact and invoke that in their application. They can also add code to existing operations on contacts. So if I'm in the Windows Explorer and I see that contact and I delete it, then the application that extended that contact has the opportunity to participate in that operation. We think it's very important to the shared data concept in making that successful, especially from an end user perspective. When they get to see all this data as their own, how do the apps still make sure they participate correctly in that data change?

Then they also get to define whole new types. They could have the menu item type, or whatever other restaurant data there is, maybe the bill type. They can define new relationships. They can associate the bill, the receipt of what the person ordered and that kind of thing, to the contact, and the contact to a calendar event. So suddenly I can do queries like, show me everyone that was here last night. Or show me all the wine that was purchased last night. Or this patron's coming in, they're a very common customer, show me the last five times they were here, show me the receipts from them so I can maybe even do things like, well, they always order the same appetizer, why don't I just go ahead and put it in the kitchen so it's ready for them when they ask for it. There's a variety of different things they can do just by adding to the schemas, as well as developing new schemas and new relationships between things, and allowing their applications to really leverage those.

ANIL NORI: I want to add a few technology points there. One of the important things in WinFS is having a data model. We went and defined data model saying that, to be consistent with the .NET CLR in a data model, we said you should be able to define new types and then you can provide inheritance for these types and you can also provide relationships between different types. To enable the ISV extensibility, you can also add new extensions. Let's say I have a contact, and multiple ISVs want to extend that contact, and then we don't want to create a mess of going and then adding properties to the existing type itself. We define a well-defined way of defining your own extensions. ISV 1 comes and defines their own extensions, ISV 2 comes and defines all their own extensions. But they all are hooked to the contact item. So when you look at the contact you can go and then inspect saying that, I am an ISV 1, I want to look at my extensions. You can find those things. We thought about extensibility as a forethought rather than an afterthought. Then we put hooks for that, and it is not only just at the adding properties, even adding behaviors, because at an instance level, on one particular mail message, some mail client may want to write a behavior in a certain fashion. At the same type but at a different instance, some other ISV may want to add behaviors. We need to be able to add behaviors at the individual mail message level, not just at the type level. We call these things the bindable behaviors. We have provided those also. There's a lot of thought went into both extensibility as well as sharing. But having said that, it is a new thing. Nobody has done these things. It is a challenge. We still need to get a better understanding of how people will share this data across multiple applications.
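
The extension model described here, where a base type is inherited, each vendor attaches its own extension to a shared item, and behaviors are bound to individual instances, can be sketched in Python. None of these names come from WinFS: `Item`, `Contact`, `PatronInfo`, `LoyaltyInfo`, the owner strings, and the `extend`/`bind`/`fire` methods are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    item_id: int
    extensions: dict = field(default_factory=dict)   # owner name -> extension data
    behaviors: dict = field(default_factory=dict)    # event name -> handlers on this instance

    def extend(self, owner, extension):
        """Attach one vendor's extension without touching the base type."""
        self.extensions[owner] = extension

    def bind(self, event, handler):
        """Bind a behavior to this particular instance, not to the whole type."""
        self.behaviors.setdefault(event, []).append(handler)

    def fire(self, event):
        for handler in self.behaviors.get(event, []):
            handler(self)

@dataclass
class Contact(Item):                                  # a new type inheriting from Item
    name: str = ""

@dataclass
class PatronInfo:          # hypothetical extension from a restaurant app ("ISV 1")
    favorite_appetizer: str
    visits: int = 0

@dataclass
class LoyaltyInfo:         # hypothetical extension from a second vendor ("ISV 2")
    points: int = 0

pat = Contact(item_id=1, name="Pat")
sam = Contact(item_id=2, name="Sam")

pat.extend("isv1.restaurant", PatronInfo(favorite_appetizer="calamari", visits=5))
pat.extend("isv2.loyalty", LoyaltyInfo(points=120))

audit = []
pat.bind("delete", lambda item: audit.append(f"isv1 archiving receipts for {item.name}"))

pat.fire("delete")         # the restaurant app takes part in the delete
sam.fire("delete")         # nothing is bound to this instance, so nothing runs
print(pat.extensions["isv1.restaurant"], audit)
```

Each vendor can look up only its own extension by owner name, and the contact type itself is never modified, which is the "no mess on the existing type" property the discussion emphasizes.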

ROBERT HESS: Will they share it? Will some company that wants to be the e-mail program, will they take and not be storing the e-mail in our schema, just simply because they want to own that within their own proprietary data store?

QUENTIN CLARK: Certainly that possibility exists. In fact, that flexibility exists a priori by how we've allowed extensibility to occur. A particular mail application could say, I'm not gonna leverage the Windows mail type, I'm gonna just create my own type for mail and deal with that, etc., and they can provide their own shell experience over that type as well. But we think there's so much value in sharing these common types and we are hoping that we've provided enough extensibility mechanisms so that that really is possible and meaningful and doesn't incur a huge burden for ISV developers. We'll find out at beta when we start to get the feedback whether or not we've actually hit the mark there, but it is absolutely something that we've been designing in from the beginning.

ROBERT HESS: Plus, hopefully it'll be the expected behavior of applications on the system so if a user installs an application that doesn't play well with the existing schemas, they'll say, Oh, this app's broken. It's like installing a Windows app today that doesn't have a cut/copy/paste in the menu anywhere because they didn't want to take and share their data around.

QUENTIN CLARK: Yeah, I think that's exactly right. I think users will become expectant of this shared data world so when they have their contacts in Windows, in this particular form, and they install an application that demands the contacts live in some other form, they're not gonna enjoy the experience of that application because it's disjoined from the rest of their system experience. We're aware of that, but that's sort of a stick, not a carrot. Right? We really want the carrot for the ISVs to be we've done such a good job in defining what these things are and providing rich extensibility around them that they can really adopt that shape and extend it in the ways they need and get a lot of value out of that.

ROBERT HESS: Now, we've talked a bit about the individual data types associated with files and information in the schema, stuff like that. What if we step back a little bit and just talk about the actual files on the system itself. Right now with Windows, ever since even DOS days had the hierarchal storage structure.

QUENTIN CLARK: C:/etc., yeah.

ROBERT HESS: Yeah, file folders and I guess DOS 1.0 didn't have folders, but DOS 2.0 added the folders finally so we actually stored some folders. When I'm doing demos of Longhorn and showing people that files no longer really are stuck to the hierarchal storage, they start getting concerned. They start wondering, You're taking away my structured storage, my ability to have a folder that's called Photographs and a folder in there called Holiday, and a folder under that called On Vacation or something like that. How are we dealing with that?

QUENTIN CLARK: There's a few things I think we're saying about this. One is, we're certainly not taking away the ability to have a container. Right now we're calling them lists and it's a folder-like construct and we're even allowing items -- I don't use the word files very much, but I use items -- and some items are file-backed and so we service them through Win32 and all that stuff for backward compatibility, but we allow items to exist in more than one list at a time. And we've given this capability so that you can actually have more flexibility in how you're organizing things than you do today. One of the things I often experience at home is, I'm sort of your typical Microsoft person and I have probably more IT infrastructure at home than I'm supposed to, but I have this server where I back up all of our photographs. I keep a log of this; it runs every night. Occasionally I look at the log just to make sure things are moving along. Every once in a while I'll see that every photograph in my system has been moved around. And it's because my wife has chosen, I don't want to arrange things in folders by years and months, I want to arrange them by which child they're about. Suddenly everything gets moved, and then three months later will go by and she'll rearrange them back to the way they were in the first place, right? She's struggling with a single way for her to think about this data, and she can't find one way that works well for her.

With WinFS, you could have all of that organization in as many different ways as you want. You can have things arranged by year, and arranged by who's in the pictures or whatever. So we're giving you that power and we're even extending that power. But further, because the data's schematized, and because we have the ability to relate data to each other, we think a lot of the needs for that disappear. In fact, it turns out with Longhorn my wife won't be putting a photograph into both My First Son's folder and the This Month folder. She'll relate that photograph to the contact of my son so she can now see all pictures through that relationship of that contact, and she can also see pictures by date. Just based on the data that's in the system, the relationship you can create between things, some of the needs for organization kind of go away. But it will allow people to kind of deal with the system as they see fit.

ANIL NORI: Here is another area. The whole space of organization. There's a lot of innovation that is happening. First, file systems are good for organization. As you said, I can have hierarchical folders, so I know how to organize these things. And then database systems have been somewhat rigid. You put it into a table and then it's very rigid. We said, let's leverage both again. But we went one step ahead saying that, to enable a lot of sharing, I want things to appear in multiple folders or multiple lists. You and I may share the same presentation, then I may want my PowerPoint file to appear in your folder as well as mine. We said, that's an absolute thing that we need to do. These are not just hierarchical folders or trees, but you can have items appearing in multiple of these folders and lists. That's the first thing.

ROBERT HESS: Because they're no longer physical folders; they're more like a query almost.

ANIL NORI: It can be query, even if it is physical. Through these relationships we can say, this particular file is related to your folder as well as my folder.

QUENTIN CLARK: It's both.

ANIL NORI: And the second point is, suppose I did all of this organization, but I have no clue, I forgot where I put it. You can come to the store and then I say, I put it somewhere, but it's called this or I know certain properties of those things, you can fire a query and then find it.

QUENTIN CLARK: Yeah. In fact, we're doing work so that there's a search bar that will appear in the Explorer and it's a natural language query interface. You'll be able to type, PowerPoint presentation I wrote last week and sent to Robert, and I have no idea where it lives in some folder. I don't know much more about it except I do know that much. I know I wrote it, I know when I wrote it, and I know who I sent it to, and that answer can come back because we have the natural search capability that's integrated and understands the structure of these things and can create the correct kinds of queries in WinFS.

ANIL NORI: So you can have organization, which is you manually put them into multiple folders, or you can do it dynamically through queries.
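
The two kinds of organization Anil distinguishes, manual placement in several lists and dynamic grouping through queries, can be sketched as follows. The photo data, the list names, and the helper functions are made up for the example and do not reflect any actual WinFS interface.

```python
photos = {
    101: {"name": "beach.jpg", "year": 2003, "people": ["Sam"]},
    102: {"name": "cake.jpg", "year": 2004, "people": ["Sam", "Lee"]},
    103: {"name": "snow.jpg", "year": 2004, "people": ["Lee"]},
}

# Static lists hold item ids, so one photo can sit in several lists at once.
static_lists = {"Favorites": {101, 102}, "To print": {102}}

def lists_containing(item_id):
    """Names of every static list the item was manually placed in."""
    return sorted(name for name, members in static_lists.items() if item_id in members)

def dynamic_list(predicate):
    """A query-backed list: membership is recomputed from item properties."""
    return sorted(i for i, props in photos.items() if predicate(props))

by_year_2004 = dynamic_list(lambda p: p["year"] == 2004)
with_sam = dynamic_list(lambda p: "Sam" in p["people"])

print(lists_containing(102))
print(by_year_2004)
print(with_sam)
```

Photo 102 appears in two manual lists and in both query results without being copied or moved. Rearranging by year or by person is then a matter of asking a different question, not reshuffling folders.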

ROBERT HESS: Because even if you think about today, we talked about how the metadata attached to an NTFS file is just the name, the size, and the dates. Well, conceptually, the metadata also includes the subfolders it's in, and where on NTFS that's a static set of metadata, with WinFS that's gonna be a dynamic set of metadata.

QUENTIN CLARK: And further, that path is fragile. Right? As I'm sure everyone in this audience certainly and you've experienced, sometimes you copy a path to something, send it to someone else and then some time later you decide to move it because you want to rearrange things, organize things differently, think about things a little bit differently, and suddenly that path no longer works. Well, we don't have this problem. We have a stronger notion of identity in the system for every item that's there, and we allow this great flexibility in terms of how things are organized, and we can still provide the ability to find things independent of how it's organized. You mentioned before some of the user experience things we've shown, shown at the PDC for example. It looks like this very sort of virtual world, it's this very query-based thing, show me all photographs, show me all photographs stacked by who's in it and that kind of thing. Well, that's actually operating over this physical organization. You can change that physical organization independently of your desire to view things through the property space, if you will.
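
A minimal sketch of why identity survives a move where a path does not, assuming a hypothetical IdentityStore that keys items by a Guid. The class and its methods are invented for this illustration and say nothing about how WinFS actually implements item identity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: items are looked up by a stable id, and the folder is
// just a mutable property. These are not WinFS types.
public class IdentityStore
{
    private readonly Dictionary<Guid, string> folderOf = new Dictionary<Guid, string>();
    private readonly Dictionary<Guid, string> nameOf = new Dictionary<Guid, string>();

    public Guid Add(string folder, string name)
    {
        var id = Guid.NewGuid();
        folderOf[id] = folder;
        nameOf[id] = name;
        return id;
    }

    public void Move(Guid id, string newFolder) => folderOf[id] = newFolder;

    // A path is derived from the current organization, so it changes on a move.
    public string PathOf(Guid id) => folderOf[id] + "/" + nameOf[id];

    // Looking up by an old path fails once the item has been reorganized.
    public bool ExistsAtPath(string path)
    {
        foreach (var id in folderOf.Keys)
            if (PathOf(id) == path) return true;
        return false;
    }

    // Looking up by id is independent of where the item currently lives.
    public bool ExistsById(Guid id) => folderOf.ContainsKey(id);
}

public static class IdentityDemo
{
    public static void Main()
    {
        var store = new IdentityStore();
        Guid id = store.Add("Drafts", "agenda.ppt");
        string sentPath = store.PathOf(id); // what you might have sent to someone
        store.Move(id, "Archive");
        Console.WriteLine(store.ExistsAtPath(sentPath));
        Console.WriteLine(store.ExistsById(id));
    }
}
```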

ANIL NORI: I think we can't go into details now, but Shell is working very closely with this whole model to visualize both these query-based as well as sort of statically list-based approaches.

ROBERT HESS: Yeah, because you guys aren't doing the visual representation of your data. You're just simply having the architecture underneath that allows someone else to come by with this cool way of visualizing it.

QUENTIN CLARK: We have no UI, we're store guys.

ANIL NORI: That's right, yeah. We're providing the core, the data platform, and Shell is providing the whole rich, the whole user interface underneath and the whole user model itself.

QUENTIN CLARK: Our space ends at the WinFS APIs, which are part of the WinFX APIs. There's this ability to come in, find items, change them, sort them, group them, all this stuff is a fairly powerful set of capabilities at the API layer, and that's where the Windows Explorer comes in and uses that API side to visualize the stuff that's in the store for users.

ROBERT HESS: Yeah. A couple times we've touched upon the notion of storing some shared information, you storing a presentation, me storing a presentation, something like that. How does this work in a server-based world? We're no longer talking about just my local hard drive; we're now talking about a hard drive that might be on a server some place. Sounds like there's gonna be some issues there.

QUENTIN CLARK: Well, issues is a leading term, but in WinFS the direction, the overall vision we have includes a set of server services as well. Longhorn, the client, is the first step in getting to this vision of managing everyday information in a much more value-added sort of way and providing a new developer platform for developers on a desktop, but the longer-term vision for WinFS has a very strong server presence, in fact, it's required really to complete the WinFS vision. We imagine not just end user data, but all sorts of data relevant to businesses being stored in WinFS servers, being modeled in the WinFS data model. The server component of this is going to enable all these other kinds of sharing scenarios, where today we use a simple \\machinename\share, you'll have a similar kind of concept to be able to service the richness of WinFS data in those same kinds of ad hoc manners, if you will, as well as services built on top of that, and we envision rich, server-based applications around document management, mail management, and all this stuff, as the technology rolls forward and as that strategy unfolds.

ANIL NORI: I think in the near term, I also think that people will write WinFS as a rich synchronization mechanism through which people can synchronize, take the server files and then offline them onto your client.

ROBERT HESS: What do you mean by that, synchronization? We haven't touched upon synchronization yet.

ANIL NORI: Yeah, we haven't talked about some of the components of WinFS itself. We primarily focused so far on the data model or what kinds of data you can store, but there are other capabilities like, if I have WinFS on multiple machines, I want to be able to synchronize my data. Let's say I have my desktop and then I have a machine at home and then I have a laptop. So I want the data of my documents to appear on all these machines. So you can set up synchronization relationships. When I make a change in one place, we automatically propagate those changes to all the other machines also.

QUENTIN CLARK: This will work client-to-client. You and I can decide to have a sharing relationship between a set of data, as well as client-to-server as a server strategy unfolds. It's actually a fairly powerful notion, WinFS-to-WinFS synchronization is. Imagine being able to model data in a certain way and define how you want to allow data to be granularized, if you will. I could define it as a solution for the agenda for this discussion, for example. I could decide that who's gonna be there is one change unit that can be tracked and dealt with independently, then say the individual topics. The individual topics can even be changed independently so you could decide, Oh, that third topic, I need to change this to be something else. And I could decide, Oh, I actually need to have Anil there, and we could make those changes synchronize and those changes won't conflict with each other. Because of the structured data in the schemas, we get this much more granular level of synchronization than has ever been possible before. You can't do that today with the unstructured data. If you change the title of a JPEG, that whole stream's basically got to be manipulated.

ROBERT HESS: Even though nothing else had changed?

QUENTIN CLARK: Even though nothing else had changed.
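
The change-unit idea from the agenda example can be sketched as a merge keyed by unit name. This is a sketch under assumptions, not the WinFS synchronization API: ChangeUnitSync, Merge and the unit names "Attendees" and "Topic3" are made up for the illustration, and the rule for conflicts (keep replica A's value and report the unit) is simply one choice among several.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of change-unit merging; not the WinFS sync API.
public static class ChangeUnitSync
{
    // Each replica reports only the change units it modified.
    // Units changed on both sides to different values are reported as conflicts;
    // in this sketch a conflicting unit keeps replica A's value.
    public static Dictionary<string, string> Merge(
        Dictionary<string, string> baseline,
        Dictionary<string, string> changesA,
        Dictionary<string, string> changesB,
        out List<string> conflicts)
    {
        var merged = new Dictionary<string, string>(baseline);
        conflicts = changesA.Keys
            .Where(k => changesB.ContainsKey(k) && changesB[k] != changesA[k])
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();
        foreach (var change in changesA) merged[change.Key] = change.Value;
        foreach (var change in changesB)
            if (!conflicts.Contains(change.Key)) merged[change.Key] = change.Value;
        return merged;
    }

    // The agenda scenario: one person edits a topic, the other edits the attendees.
    public static Dictionary<string, string> RunAgendaExample(out List<string> conflicts)
    {
        var baseline = new Dictionary<string, string>
        {
            { "Attendees", "Robert, Quentin" },
            { "Topic3", "Schemas" }
        };
        var changesA = new Dictionary<string, string> { { "Topic3", "Synchronization" } };
        var changesB = new Dictionary<string, string> { { "Attendees", "Robert, Quentin, Anil" } };
        return Merge(baseline, changesA, changesB, out conflicts);
    }

    public static void Main()
    {
        List<string> conflicts;
        var merged = RunAgendaExample(out conflicts);
        foreach (var unit in merged) Console.WriteLine(unit.Key + ": " + unit.Value);
        Console.WriteLine("conflicts: " + conflicts.Count);
    }
}
```

With whole-file synchronization the two edits above would collide because both touch the same stream; tracked per unit, they touch different keys and merge cleanly.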

ANIL NORI: Yeah. Using this synchronization mechanism, even if you don't have WinFS on the server, somebody can write what we call a sync adapter where they can go in and then pull some files, and then push it into the client. You manipulate on it and then you can push it back. The policies, how you check in and check out of the server, whatever application that they're using, whatever document management system they're using, they can plug into that.

ROBERT HESS: Now, you guys have spent quite a while working on WinFS and thinking through the issues and trying to make something that's a good programming model and thinking about, what would an application really want to do with this sort of stuff. I assume maybe you've also maybe spent some time thinking about, what's an application you'd like to see come out that uses WinFS? A few people have commented on my blog saying, That's what I want to find out; what is the killer application for WinFS? Do you have any input on that?

QUENTIN CLARK: Well, I actually get asked this question quite a bit, as you might imagine. My first answer sounds like...

ROBERT HESS: Of course, I think partly the audience is interested so they can quickly write that application and make a lot of money on it.

QUENTIN CLARK: Yeah exactly. I think the first part of this answer is, we don't know. In the Windows 95 team, which was developing the last major huge platform breakthrough change for Microsoft, they didn't envision IE and Outlook and Netscape, and these apps would become the killer apps for that platform. Similarly, we know we don't know yet what those killer apps are gonna turn out to be, but we do suspect some things about it. The app will take advantage of the data model in a very rich way. Simple things like, what is the daily life of a consumer at home trying to manage their family and their family communications and why do I miss birthdays, and why can't I find the video footage of my son's first birthday, and why is all that so hard, and how come I can't relate these things together? How come I can't know when I'm planning a dinner party that my friend Henry's allergic to cashews when I'm planning the dinner for that? The data's out there, and building an application that will be able to actually take advantage of that finally and allow people to capture that information and do something with it.

Part of this kind of goes back to, there was a version of Office that you guys might remember that when you'd save the file it would bring up the file properties dialog. Remember this? It would always bring up that dialog and you had to fill this thing out or you'd turn tools options and like six tabs and you could turn off the thing that says, Stop bringing that dialog up. Well, the reason that everyone turned it off and the reason we stopped doing that in Office, was because we didn't do anything with that data afterwards. Today, for me to capture in my Outlook contact that this friend of mine's allergic to cashews, it doesn't really help me that much. I don't get to reuse that very much. But imagine I have an application that's about managing my home life or doing party management or whatever and it can now tap into that data. Well, suddenly it becomes more useful for me to have input it in the first place. So capitalizing and leveraging data that people input, and people take the time to put in the system, that's the kind of app that's gonna become killer apps for the sort of WinFS look at Longhorn. There's a whole bunch of other things around how Indigo's gonna be used and how Avalon's gonna be used and the fact that I can pick up WinFS items and use Indigo in a very easy manner to get things across to other machines and other applications. That opens up a whole new set of possibilities. But for WinFS, and so the way we think about the world, taking advantage of the data model and encouraging users to put information in and actually giving them benefits as a result, that's what's gonna cause a real change.

ANIL NORI: I sort of see two tiers of applications. One is, obviously over time within Microsoft itself a lot of applications would target WinFS. You have Media Player, then you have Outlook and then you have WSS. Over time there's a good target. Second thing is, for a lot of ISVs, we suddenly have a lot of the database-type of applications that you can start developing. You brought up the restaurants. It's a great application to write a restaurant management application. There was somebody who suggested maybe a wine cellar application. I have different types of wine, I want to organize all these wines and then I have to have wine lists and then who the makers are and when, so a lot of the database type of applications, suddenly we open a floodgate that you're gonna start developing those things. Because they were not enabled by the file system so easily, and suddenly the structured data, as well as the query, gives you that capability now.

ROBERT HESS: The main difference between a restaurant application and a wine cellar application on WinFS than what we have today, is today it's a silo. It's a proprietary data model, and with WinFS it would allow that same bottle of wine, that same customer, to be able to be seen in your Outlook contacts, to be able to be seen in the file system itself.

QUENTIN CLARK: Imagine the integration you could do with a wine -- and there are little wine apps built on Access and those kinds of things today -- but imagine then you could provide features in that wine app like, I know I drank this bottle, but I don't remember who I had it with. Suddenly that's possible because I can take these instances of these wine bottles, relate them to dinners I had or where I brought the bottle and who else was in attendance there, and suddenly these relationships between the data become exposable and become interesting. That kind of gets back to the point I was making about, I think the killer apps will be the things that actually take advantage of the fact that this data now exists and exposing it in ways that are, at this point, unexpected to people. No one's expecting if they installed a wine management app to have, somewhere in the UI, a who I shared this vintage with before button. That's very unexpected to people today, because they don't have that shared data and they will in this future.
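
The "who did I share this bottle with" feature is, at bottom, a query across relationships. A minimal sketch, assuming a hypothetical Dinner type that relates bottles to attendees. Nothing here is a WinFS schema, and the names and dates are invented sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: a dinner relates the bottles served to the people present.
public class Dinner
{
    public DateTime Date;
    public List<string> Attendees = new List<string>();
    public List<string> BottlesServed = new List<string>();
}

public static class WineCellar
{
    // Follow bottle -> dinners -> attendees, leaving out the owner.
    public static List<string> SharedWith(IEnumerable<Dinner> dinners, string bottle, string me) =>
        dinners.Where(d => d.BottlesServed.Contains(bottle))
               .SelectMany(d => d.Attendees)
               .Where(a => a != me)
               .Distinct()
               .OrderBy(a => a, StringComparer.Ordinal)
               .ToList();

    // Invented sample data for the illustration.
    public static List<Dinner> Sample() => new List<Dinner>
    {
        new Dinner { Date = new DateTime(2004, 1, 10),
                     Attendees = { "Me", "Henry" },
                     BottlesServed = { "1998 Merlot" } },
        new Dinner { Date = new DateTime(2004, 2, 14),
                     Attendees = { "Me", "Anil", "Henry" },
                     BottlesServed = { "1998 Merlot", "2001 Riesling" } },
        new Dinner { Date = new DateTime(2004, 3, 1),
                     Attendees = { "Me", "Robert" },
                     BottlesServed = { "2001 Riesling" } }
    };

    public static void Main()
    {
        foreach (var name in SharedWith(Sample(), "1998 Merlot", "Me"))
            Console.WriteLine(name);
    }
}
```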

ANIL NORI: I think there are two fundamental things I believe that are exciting to me. One is the fact that it is a platform. Second thing is the fact that there are schemas. They grow sort of over time, because we think that there will be a community of schemas that will be developed. Microsoft will probably maybe, we'll come out with maybe 20 schemas, or 25 schemas, but we hope that more ISVs, maybe in the restaurant business, somebody defines the food schema or maybe the ingredient schema or the billing or whatever it is, and it creates a small cottage industry around that where people will start writing more schemas and then more applications. Because in the end, the content and then the schemas is what will drive these applications.

QUENTIN CLARK: Yeah. Imagine the network effects possibilities of these things, where the cottage industry defines some stuff around the restaurant experience. One of those app ISVs decides to build a consumer side of this thing. Now as a consumer, I have this notion of restaurants and restaurants that have built or are using these applications on WinFS can communicate back to me so that I understand where I've been and what I've done and which bottles of wine I had and what menus I liked and didn't. Suddenly it allows me as a consumer to start to understand things like, Oh, this restaurant has this special because they've been maintaining this data about me and now they can send this to me and I can go back and say, Have I been there, when's the last time I was there? And it really allows the sort of next level of interconnecting things that really hasn't been possible before, as time rolls on here.

ROBERT HESS: I think we're about running out of time now. Are there any final words you want to leave our audience with that you think is important about WinFS then to understand about the architecture and some of the time that you guys have put into designing it?

QUENTIN CLARK: Anil?

ANIL NORI: Yeah. I think there are four or five things in the overall architecture to me. The key thing is the data model itself, as I mentioned, the ability to put structured data, the relationships, the extensibility, the logic that you can add. Second thing is the schemas that come with the platform. Then the third thing is the sharing aspects of it. There's a lot of stuff that's gone in to enable sharing. In some ways, I feel like sometimes we bend over backwards to enable sharing. It's a very tough problem. As we mentioned, we'll find out more once data goes and then how people will share that. Then finally, the whole active mechanisms like, we have a synchronization mechanism, we have notifications, when something changes you get notified. You can write tons of applications where on the client side, some data changes and now I can show the new data that is coming. When you put all of them together you can write a rich variety of applications.
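
The notification idea ("when something changes you get notified") maps naturally onto a C# event. A minimal sketch, assuming a hypothetical WatchedItem class; the WinFS notification API itself is not shown in this discussion, so this only illustrates the pattern.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item that raises an event when its title changes.
public class WatchedItem
{
    private string title;

    // Subscribers receive the old and the new title.
    public event Action<string, string> TitleChanged;

    public WatchedItem(string title) { this.title = title; }

    public string Title
    {
        get { return title; }
        set
        {
            if (value == title) return; // nothing changed, nothing to report
            string old = title;
            title = value;
            TitleChanged?.Invoke(old, value);
        }
    }
}

public static class NotificationDemo
{
    // Returns the notifications a client would have received.
    public static List<string> Run()
    {
        var log = new List<string>();
        var item = new WatchedItem("Draft agenda");
        item.TitleChanged += (oldTitle, newTitle) => log.Add(oldTitle + " -> " + newTitle);
        item.Title = "Final agenda";
        item.Title = "Final agenda"; // same value again, so no notification
        return log;
    }

    public static void Main()
    {
        foreach (var line in Run()) Console.WriteLine(line);
    }
}
```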

QUENTIN CLARK: I'd echo his comments and emphasize a certain point. We've built this data model to solve a fairly broad set of problems. Most of them, frankly, focused on allowing ISVs to have new innovations and new opportunities on the platform. Part of that is we've done some backward compatibility work and some integration to the file system so we can handle things like file streams correctly. We can handle allowing your existing PDF documents to be opened by Acrobat Reader in WinFS without having to have some massive gearshift. But that capability is not where the newness is, it's not where the innovation's gonna come from. The innovation will come from leveraging the new data model, leveraging the new ability to relate things and query on things and structure things. And so my challenge to ISVs has been and was at the PDC and will continue to always be, deep, go deep. Think about the data model. What does it do for your app? How does it really fundamentally change your app? When you start to look at that and you figure out what more atomic pieces of data that really belong to the end user you can now put into the model that way, what does that mean and what does it mean for the shell experience and how do you want to inform the user experience when they're in the shell, not just in your app in doing those kinds of data. It's an exciting time, and there'll be a big growth opportunity around that innovation. It's definitely very exciting.

ANIL NORI: Another thing is, to me, hey, this is the beginning, because we have a much larger vision where we have server plans, and also the next versions of WinFS. And I think we just started it now.

ROBERT HESS: When Longhorn comes out you're not just gonna shut the door and go home?

QUENTIN CLARK: No, probably not.

ANIL NORI: I'm all set for the next decade.

QUENTIN CLARK: Yeah. Anil and I are looking at quite a few years ahead of us. We've got plenty to do, without a doubt. Yeah. I think that's an important point. As time goes by we'll of course be able to reveal details of the rest of the planning work we're doing, but we have a very rich story, and most of the questions that we see down on the blogs are good questions. It's like, yeah, we're developing answers for these, we just need to kind of get our ducks in a row and get our Ts crossed before we can really kind of go out there with those plans.

ROBERT HESS: Well, thanks very much for joining us. I appreciate talking with you, and you gave some really good insights about what WinFS is and what you think developers can and should be doing with it.

QUENTIN CLARK: Great. Thanks, Robert.

ANIL NORI: Thanks.

ROBERT HESS: Well hopefully that gave you some insights about the architecture of WinFS and what Quentin and Anil have been doing the last couple years trying to make sure your problems with file structures are being solved. After this brief break we're gonna come back and take a look at some programming capabilities and see how actual code is written to work against WinFS.

DIVERSIONARY TACTICS

Man on the street interviews... virus software awareness.

ENTER THE PROGRAMMER

Robert Hess meets with Mike Deem, Lead Program Manager for the WinFS Programming Model.

ROBERT HESS: Welcome back. Now, previously we had a really good discussion about what the architectural aspects are of WinFS and why your applications might want to use it. Now we want to actually dive into some code and show you how, in programmatic fashion, you can access some of those capabilities we were talking about. To show us that, I have with me Mike Deem. Mike?

MIKE DEEM: Hi.

ROBERT HESS: Now, what exactly do you do for WinFS?

MIKE DEEM: I'm a Lead Program Manager on the team that's responsible for building the managed APIs on top of WinFS, the APIs that applications use to actually interact with WinFS.

ROBERT HESS: Now, you specifically said the managed APIs. Does that mean there's gonna be some unmanaged APIs for accessing WinFS?

MIKE DEEM: No, we're doing all managed. One of the things that we really want to do for Longhorn and around WinFS is really emphasize the managed APIs. There's a whole lot more value add in that space than doing COM APIs or the good old fashioned Win32 DLL-type APIs.

ROBERT HESS: Now, I know that application developers are gonna write Longhorn applications, can write standard apps and they'll run on Longhorn just fine. Does that mean that if there's anything out of WinFS they want to have access to they're gonna have to write that in a managed wrapper for their application, or will there be anything they can touch on that's in an unmanaged code?

MIKE DEEM: What they can do is use some of the really neat features that they're doing in the Managed C++ compiler in the Whidbey release to do these hybrid apps, applications that are mostly native and then can interact with managed code as well. That seems to be solving most of the problems. We're actually using that a lot in building the shell and other things in Longhorn, so it's working quite well.

ROBERT HESS: Now, in writing code for WinFS and following the managed model of getting the .NET Framework it started on and stuff like that, are you finding that the code is much easier to think through and develop than it was previously with standard NTFS calls?

MIKE DEEM: It's certainly a lot easier to use. Figuring out the best way to use all the features of the managed code platform, especially some of the new features with generics and some stuff that's coming in the Whidbey release, is kind of tricky. There's not a whole lot of precedent for how to apply some of those technologies, but we can use those to really improve the overall usability of the API. We're learning a lot about our platform, about our managed platform, in developing this API.

ROBERT HESS: Now both Quentin and Anil mentioned quite a bit that the structure of WinFS is gonna be more like a database kind of model with queries and schemas and so forth like that. In programmatically accessing these features, are we treating it like a database and having the old Access, and ADO, and all those other type of acronyms for accessing database stuff or is more like a true .NET Framework API model?

MIKE DEEM: It's definitely a true .NET Framework API model. There's a real difference in the philosophy behind database-style APIs and file system-style APIs. In a file system-style API you tend to open a file, maybe lock it, do a bunch of read/writes on the file and then close it. You work more or less one file at a time. Whereas with database APIs, it's more of a query, modify, and submit type mode of operation. The database-style of APIs take into account things like that the back end is an intelligent store. It can run stored procs and you can have transactions and there can be triggers that do side effecting behaviors. Also they tend to take into account that the data is likely to be remote from the client, which gives you much better failure modes when you are in a distributed environment than what you get with file IO.

I don't know if you've ever used an application across the network that's doing file IO, and then something goes wrong with your network and the application sort of just burps, it hangs. Usually in a very unpredictable way, freezes the UI, it just sort of stops. But with a database-style API, you tend to pull data over into your working set, work on the data there, and then when you're all done you push it back. So you have these very discrete failure points in your application. That allows you to build applications that work much better in a networked environment. So what we're doing for the WinFS API is definitely following more the database-style API. Like you say, WinFS is a database; it's a database that can store files and make the metadata in those files queryable. But it really is a database, and so by adopting more the database-style API we give you all the advantages that that brings, the ability to do the remote interaction, the ability to take advantage of transactions and things along those lines.
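
The pull, work, push pattern and its discrete failure points can be sketched as follows. RemoteStore, Pull, Push and TryPush are hypothetical names chosen for this illustration, and the "network" is just a flag; the point is only that the store can fail at two well-defined calls and nowhere in between.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical remote store; only Pull and Push touch the "network".
public class RemoteStore
{
    public bool Online = true;
    private readonly Dictionary<string, string> data = new Dictionary<string, string>();

    public Dictionary<string, string> Pull()
    {
        if (!Online) throw new IOException("store unreachable");
        return new Dictionary<string, string>(data); // copy into the client's working set
    }

    public void Push(Dictionary<string, string> workingSet)
    {
        if (!Online) throw new IOException("store unreachable");
        foreach (var entry in workingSet) data[entry.Key] = entry.Value;
    }

    public string Read(string key) => data.TryGetValue(key, out var value) ? value : null;
}

public static class WorkingSetDemo
{
    // Returns true if the push succeeded; a network failure surfaces here and nowhere else.
    public static bool TryPush(RemoteStore store, Dictionary<string, string> workingSet)
    {
        try { store.Push(workingSet); return true; }
        catch (IOException) { return false; }
    }

    public static void Main()
    {
        var store = new RemoteStore();
        var work = store.Pull();                 // failure point 1
        store.Online = false;                    // the network drops
        work["title"] = "Edited offline";        // local edits never touch the network
        Console.WriteLine(TryPush(store, work)); // failure point 2, reported cleanly
        store.Online = true;
        Console.WriteLine(TryPush(store, work));
        Console.WriteLine(store.Read("title"));
    }
}
```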

ROBERT HESS: In other words, it's not just still a binary stream with just a bunch of metadata attached to it; we're actually treating even the binary aspect of the data itself as being capable of being more a database sort of model.

MIKE DEEM: Yeah. Anil and Quentin talked a lot about the schemas, and what we can do, because we have those schemas and we know what this data looks like, we can give you a very strongly typed API. So rather than dealing with rows and columns, which you would do with something like ADO.NET, you're actually dealing with like a Person object, or a Track object or a Document object. You're interacting at that level. Much more sophisticated, much more capable than doing the rows and column type things. A lot less code to get the job done.
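
The difference between rows and columns and a strongly typed object can be seen in a small comparison. This is a sketch under assumptions: the Person class below is a stand-in written for the example, not the WinFS Person class, and the dictionary stands in for a generic row.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in type; not the WinFS Person class.
public class Person
{
    public string DisplayName;
    public DateTime Birthday;
}

public static class TypedVersusRows
{
    // Row-and-column style: names are strings, values are objects, and every
    // use needs a cast. A misspelled column name only fails at run time.
    public static int BirthMonthFromRow(Dictionary<string, object> row) =>
        ((DateTime)row["Birthday"]).Month;

    // Strongly typed style: the compiler checks the member name and its type.
    public static int BirthMonth(Person p) => p.Birthday.Month;

    public static void Main()
    {
        var row = new Dictionary<string, object>
        {
            { "DisplayName", "Henry" },
            { "Birthday", new DateTime(1970, 6, 15) }
        };
        var person = new Person { DisplayName = "Henry", Birthday = new DateTime(1970, 6, 15) };

        Console.WriteLine(BirthMonthFromRow(row));
        Console.WriteLine(BirthMonth(person));
    }
}
```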

ROBERT HESS: Well, let's take a look at some of the code you've prepared for us.

MIKE DEEM: Okay. What I've got here is a really simple application. I actually wrote it this morning. I wanted to just get across some of the kind of high-level points about the WinFS API, about how it works. One of the first things that you'll do when working with WinFS is create an ItemContext. You open an ItemContext here. What this is, is the context through which you interact with WinFS. What I'm doing here where I've just said, Open and not passed any parameters is I've said, I want to connect to the local default store on this computer. I could have put a UNC path to a remote computer in there. I could have also listed multiple things. Unlike a database connection object, which represents a connection to a single database, an ItemContext can represent a connection to multiple WinFS stores. When you interact with it, it actually does queries against all those stores and pulls the data back and merges them together. It's a bit more sophisticated than just a database connection. But in a lot of ways you can think of it like that.

using ( ItemContext ic = ItemContext.Open() )
{
	ListAllDocumentsWithTitle (ic, "%Cache%");
	CountAllDocuments(ic);

	// AddContacts(ic);
	// SetupEmploymentRelationships(ic);
	// RemoveContacts(ic);

	// CreateRichApplicationView(ic);

	ic.Close();

}

So I create one of those here, and then I'm gonna do some different things. I've commented out some of the calls that I don't want to run quite yet, and then when I'm all done I'm gonna close the ItemContext. So that's the basic pattern around an application that interacts with WinFS. The first thing I did in here was, I wanted to list all the documents that have the word "cache" in the title, so we're gonna look at this here. Here's the code for listing all the documents. What I did is I passed in the ItemContext that I opened, and the next thing I do is I get a Searcher for documents. This is a pattern that shows up over and over again in the WinFS API. I take the type name of what I want to search for and then I do dot, and there's static methods on it, called GetSearcher. There's lots of different varieties of these. Now, Document is one of these high-level classes that we created from the schemas. If you look in the Object Browser here, we can see that there's a whole lot of these classes in WinFS; we'll look at Contacts here. Inside of Contacts we've got Group, we've got Persons in here, Household, Organization, these are all classes that ship with the platform.

static void ListAllDocumentsWithTitle (ItemContext ic, string titlePattern)
{
	Console.WriteLine("All Documents Matching '{0}'", titlePattern);

	ItemSearcher searcher = Document.GetSearcher(ic);
.
.
.

ROBERT HESS: The default schemas, correct?

MIKE DEEM: The default schemas that ship with the platform. You can discover what kinds of information WinFS can contain just by using the Object Browser and IntelliSense in Visual Studio. This is all enabled out of the box; you don't have to do anything to create new schemas or install new schemas to use these classes. What I'm gonna do here is I wanted to do a search for all documents, and then I'm gonna do a filter where the title is "like," and if you're familiar with kind of database-style SQL queries, "like" is an operator that does a pattern match. Then I'm gonna say I have a parameter that I'm gonna call Pattern, the @ sign introduces a parameter here, and then I'm gonna set the value of that parameter here. So Pattern and then the title pattern that I passed in. The next thing that I do here is I actually execute the search to find all the documents that are in here. We can go ahead and run this and see the results here. I actually found, how many is there? Maybe 10 or so documents that have "cache" in the title, and you'll notice some duplicates. We've actually got almost 4,000 documents on this system and you saw how quick that search happened. These documents are spread across a number of different folders, they aren't all in the same folder, and title is data that would be embedded inside of the document. When the documents were put in WinFS, we went in and we pulled that information out, put them in the tables in WinFS, and allowed me to execute the query across those. It came up with this answer very quickly, much quicker than you could do with a search across the file system.

.
.
.
	searcher.Filters.Add("Title like @Pattern");
	searcher.Parameters.Add("Pattern", titlePattern);

	foreach(Document d in searcher.FindAll())
	{
		Console.WriteLine("   {0}", d.Title);
	}

	Console.WriteLine("");
}
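
For readers less familiar with the "like" operator, its pattern matching can be approximated in plain C#. This is a sketch under assumptions and not how WinFS evaluates filters: LikeMatcher is a hypothetical helper, it follows the usual SQL convention that % matches any run of characters and _ matches exactly one, and it compares case-insensitively, which depends on store settings that are not discussed here.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that mimics a SQL-style LIKE pattern with a regex.
public static class LikeMatcher
{
    public static bool Like(string value, string pattern)
    {
        // Escape regex metacharacters first; % and _ are not regex
        // metacharacters, so they survive and can be translated afterwards.
        string regex = "^" + Regex.Escape(pattern)
                                  .Replace("%", ".*")
                                  .Replace("_", ".") + "$";
        return Regex.IsMatch(value, regex,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }

    public static void Main()
    {
        string[] titles = { "Output Cache Design", "cached pages", "Budget 2004" };
        foreach (var title in titles)
            Console.WriteLine(title + ": " + Like(title, "%Cache%"));
    }
}
```

Passing the pattern as a parameter, as the code above does with @Pattern, keeps the filter text fixed and lets the caller change only the value.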

ROBERT HESS: Essentially you've then got 3,893 files on your system that have a type of document. Now, that doesn't just mean files with a .doc extension on them, but files whose type is internally marked in the metadata as a document type. And for that search to happen you basically searched every single one of those files to find out whether or not it matched.

MIKE DEEM: Yep. And it includes all different kinds of documents, not just doc files.

ROBERT HESS: So it even could have been PowerPoint files and Word docs, Word Perfect docs and stuff like that.

MIKE DEEM: Mm hm.

ROBERT HESS: Now, from a speed standpoint, are we really trying to pay an awful lot of attention to the performance stuff, because that was highly performant and we're expecting that same level of performance to happen with the rest of types of searches people do?

MIKE DEEM: Yes. Performance is a critical feature of WinFS. We're being compared to file system performance. When you go into a folder and you type "dir," that happens really quickly. People expect that type of performance out of their file system. In WinFS, for similar scenarios, if I want to list all the documents in a folder, we need to have similar performance. But where WinFS really wins is to compare it to the performance of if on XP let's say I go into Explorer and do Search and type in a string there, where it's opening up every file and searching through it, that can take a long time, especially if it's doing that recursively through a big directory structure, whereas with WinFS, it's the same speed as doing "dir."

ROBERT HESS: Right. That was a lot faster. If I were gonna do the exact same thing on Windows XP right now with 4,000 documents scattered across my hard drive, it wouldn't come up anywhere near as fast as that.

MIKE DEEM: Absolutely not, yep.

ROBERT HESS: Okay.

MIKE DEEM: Let's look at the next thing that I've got in here. We'll comment out these because we've looked at those and don't want to do them.

AddContacts(ic);
SetupEmploymentRelationships(ic);
RemoveContacts(ic);

WinFS stores a lot of data in addition to files. Contact information, like Persons and Organizations and Households and things like that, aren't represented as files at all. But WinFS can store those as well. What I'm gonna do is actually show some code that goes in and creates some Contacts and does some other interesting things with them. When you create an item in WinFS, you always have to put it in a folder. Folders are kind of the overall organizational structure. I know when you were talking to Anil and Quentin, you guys talked a little bit about the tradeoff between having that organizational structure and then just having all the items just in a big pool of items. When you actually go and create an item, you have to put it in a folder. Just like a file, if you delete the folder that a file's in, the files in the folder go away.

static void AddContacts(ItemContext ic)
{
	Console.WriteLine("Adding contacts...\n");

	Folder parentFolder = UserDataFolder.FindMyPersonalContactsFolder(ic);

	Folder childFolder = new Folder();
	.
	.
	.

What I'm gonna do here is I want to get a folder that I can put my Contacts in. There's a standard folder in the system called the My Personal Contacts folder. What I'm doing here is I'm getting that Personal Contacts folder. Then I'm gonna create a subfolder under that that I'm gonna use for my Contacts, and I'm just gonna call it Example Contacts. In order to create a new item in WinFS -- and folders are items just like everything else -- I just say new Folder. Then I go to the OutFolderMemberRelationships collection on parentFolder. What this collection does is represent the relationships between the folder and the items that are in the folder.

When you think about the file system and how you have files in a folder, you don't think about relationships. It doesn't come to mind that there's this thing that sits between the file and the folder. There actually is. There's a directory entry in there that has the information about the fact that that file's in that folder. Relationships are the thing in WinFS that plays that same role. This collection gives me a way on this folder object to manipulate these things. Unlike the file system though, there are many, many, many different kinds of relationships in WinFS. Rather than just saying folder members here, we have to have a way of expressing the fact that there can be many different types of these relationships. In this case, I'm gonna work with the folder member relationships. I'm gonna add an item to that collection, the childFolder that I'm gonna add, and I'm gonna give it this name in that folder. You might notice that I've got a name here and a name up here and they happen to be the same name in this case. Every item in WinFS has a display name. That's kind of the name that you see in the shell when you're interacting with WinFS through the shell. This other name is the name that has to be unique for that item within its folder. We're working on some things in the system where you wouldn't have to put both of these names in, but right now you do.

.
.
.
childFolder.DisplayName = "Example Contacts";
parentFolder.OutFolderMemberRelationships.AddItem(childFolder, "Example Contacts");
.
.
.
ic.Update();

When I do this Add, I've actually just made this change in memory. This is really getting to one of the key points in WinFS, where everything that I do with the WinFS API doesn't cause a round-trip back to the store. That's one of those differences between a File System API and a Database API. With a File System API when I do a read or a write, barring any caching that is going on, I am actually interacting with the disk at that point. In WinFS, when I do something like add this item to this collection, it's just in memory, nothing has actually been saved yet. It will actually get saved when I call this down here, ic.Update.

ROBERT HESS: So you as a programmer, are controlling when that actually happens, where in the files of today, with the Lazy Read, Lazy Write sorts of stuff, the programmer really can't control it unless he wants to manually just do a flush at a certain point in time.

MIKE DEEM: Yeah. And what's even more important, is you can do many more operations in memory and then say, okay, now I've got everything that I want and I want to push all those changes back out to the store. That's exactly what I did here. I want to create the folder I'm gonna put my Contacts in, and then I call this method, which I'll show you in a minute, to create two different Person items and an Organization item. Only then do I want to save all these. Now if I were really writing a real app, I would want to put my try catch around this update, because that's where my network errors can occur. That's where I need to pop up the hourglass dialog. That's where I need to tell the user, Hey, I'm doing something now.

CreatePerson(childFolder, "Mike", "Deem", "123", "456-7890", "microsoft.com", "mikedeem");
CreatePerson(childFolder, "Robert", "Hess", "987", "654-3210", "microsoft.com", "roberth");
CreateOrganization(childFolder, "Microsoft", "123", "456-7890");
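
The batching pattern described above (make several changes in memory, then push them all to the store in one Update call wrapped in a try/catch) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the WinFS API: SketchContext, UpdateSketch, and the StoreAvailable flag are hypothetical names invented for the sketch, and the in-memory list stands in for the real store.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a store context; this is NOT the WinFS ItemContext.
public class SketchContext
{
    private readonly List<string> pending = new List<string>();
    private readonly List<string> store = new List<string>();

    // Hypothetical switch used to simulate an unreachable store.
    public bool StoreAvailable = true;

    public int PendingCount { get { return pending.Count; } }
    public int StoredCount { get { return store.Count; } }

    // Records the change in memory only; nothing reaches the store yet.
    public void Add(string itemName)
    {
        pending.Add(itemName);
    }

    // Pushes every pending change to the store in one step.
    public void Update()
    {
        if (!StoreAvailable)
            throw new InvalidOperationException("Store is unreachable.");
        store.AddRange(pending);
        pending.Clear();
    }
}

public static class UpdateSketch
{
    // Returns true when the batch was saved, false when the save failed.
    public static bool TrySave(SketchContext context)
    {
        try
        {
            context.Update();
            return true;
        }
        catch (InvalidOperationException ex)
        {
            // A real application would surface this to the user here.
            Console.WriteLine("Save failed: " + ex.Message);
            return false;
        }
    }
}
```

Because the failure can only happen inside Update, the error handling lives in one place no matter how many in-memory changes came before it. If the save fails, the pending changes in this sketch stay queued so the caller can retry.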

I've got my folder, now I'm gonna go and create these Person items. I just wrote this method that encapsulates this so I didn't have to write it over and over again. To create a Person item in WinFS, I just create a new Person object, set the DisplayName, and I'm setting the DisplayName from the parameters that I'm passing in here. I'm also gonna store the FullName of this person. This is an example of some of the advantages of having these rich schemas in WinFS. Sure, I could always just set the DisplayName to FirstName, LastName. Or I might set it to LastName, FirstName. Or I might set it to any combination of those things. But if I wanted to do a query for all people who have a certain last name, I need to have a place in the schema, in the storage, that stores just the last name. That's what this FullName thing and personal names are about. I create a FullName object and a FullName object encapsulates the GivenName and the Surname, and we can look at this here. There are some utility APIs in here, but it also has things like MiddleName and Nickname and pronunciation keys, suffixes. It's a complete schema of all-around names. The other interesting thing to notice in these schemas is that you don't have just one FullName for a person, you actually have a collection of FullNames for a person. That recognizes the fact that some people have more than one name. The schemas can't just limit you to that one thing. That kind of illustrates some of the work that we're putting into designing these schemas, being able to account for all these varied scenarios.

static void CreatePerson(Folder folder,
	string givenName, string surname,
	string phoneAreaCode, string phoneNumber,
	string emailDomain, string emailUserName)
{
	Person p = new Person();
	p.DisplayName = givenName + " " + surname;

	FullName fullName = new FullName();
	fullName.GivenName = givenName;
	fullName.Surname = surname;
	p.PersonalNames.Add(fullName);

ROBERT HESS: And some of that comes because of the fact that you aren't being the application. You're trying to be the platform that applications build upon, so you can't just think about the specific needs you have for your particular situation. You've got to think about what's Outlook gonna want? What's Groove gonna want? What are the other applications gonna want? What might other applications we can't even imagine right now possibly want?

MIKE DEEM: Exactly. Yeah. Even if we did leave something off that somebody really wanted, there is the ability to write extensions, to actually add your own types to WinFS. I've mentioned a couple times how we generate these classes from the schemas. So the APIs, all these objects that we're looking at, FullName and Person and things like that, we go through this step where we take these schema documents, which are these XML documents, and we actually run them through a code generator and spit out all these classes. The real advantage of taking this approach, other than that we can have lots and lots of these things, is that we get consistency across all this data, where the way that I work with a Person item is the same as the way that I work with a Track item or a Document item. It also means that as developers add their own schemas to WinFS, we're gonna ship this same tool that we used to generate the APIs that we ship, and they can run their schemas through these tools and generate their own APIs. Those APIs will have the same basic overall patterns as the ones that we ship and they'll totally integrate in with our infrastructure. What this does is provide us a whole lot of leverage to help developers learn the APIs and to be very productive with them. Once you've got the feel for the WinFS API, you can do all kinds of stuff in WinFS with it.

Here I've added my FullName; I'm gonna do something similar with TelephoneNumbers. Again, we've got this idea that well, we've got the AccessPoint, which is the whole concatenated-together telephone number. We also want to identify what type of thing is going in here. You may notice that what I've called this is a PhoneEAddress. I've created an instance of type TelephoneNumber. We can go into the Object Browser here, and we can look at the fact that TelephoneNumber is actually derived from ContactEAddress. We could go and look at ContactEAddress and see that ContactEAddress is derived from EAddress. EAddress stands for electronic address. There's many different types of EAddresses in the system. There's e-mail addresses, there's chat addresses, there's phone numbers, there's pager numbers. We've taken the time to figure out how these different types of data relate to each other and structured the schemas in such a way that there's this basic EAddress concept, and there's types derived from EAddress to represent different specializations of that. It's extensible in that if you come along as a developer and say, Well, the EAddresses that you built into the system are great. I want to add my own for this new type of thing that doesn't even exist yet, there's a way to do that within the schemas. You don't have to just go off and invent your own separate item type. You can derive a type from EAddress and it just integrates right in with everything else that's in the system.

TelephoneNumber phoneEAddress = new TelephoneNumber();
phoneEAddress.AccessPoint = String.Format("{0}-{1}", phoneAreaCode, phoneNumber);
phoneEAddress.AccessPointType = new Keyword("POTS");
phoneEAddress.AreaCode = phoneAreaCode;
phoneEAddress.Number = phoneNumber;
p.EAddresses.Add(phoneEAddress);

ROBERT HESS: Kind of plays well with others and that sort of thing.

MIKE DEEM: Yep. What I've done here is I've set the AreaCode and the phoneNumber and then I'm gonna add it to the person's EAddresses collection. Similarly, for the e-mail address that I have here I pass to this routine, my username and domain, and I add those things to the e-mail address. Then to the folder that I passed in, I'm going to add the item that I want to this folder. In this case, I'm using the DisplayName.

	SmtpEmailAddress emailEAddress = new SmtpEmailAddress();
	emailEAddress.AccessPoint = String.Format("{0}@{1}", emailUserName, emailDomain);
	emailEAddress.AccessPointType = new Keyword("SMTP");
	emailEAddress.Domain = emailDomain;
	emailEAddress.UserName = emailUserName;
	p.EAddresses.Add(emailEAddress);

	folder.OutFolderMemberRelationships.AddItem(p, p.DisplayName.Value);
}

Now it's interesting. You might notice that I've got this DisplayName.Value on here, whereas you'd think DisplayName is a nice simple type. It's actually -- let's see if I can make IntelliSense do the trick here for me. What DisplayName is, is a nullable type. This is a case where we're using generics, a new feature in the Whidbey release of the CLR, that you can think of as similar to templates in C++. It's a way of parameterizing types.

People who are really familiar with the CLR will know that there are basically two broad categories of types in the CLR: value types and reference types. Value types can't be null, whereas reference types you can always assign null to. In WinFS, in the schemas, we mark each property as to whether or not it's nullable. In this case, DisplayName is a string, but we've also said that it's a nullable string. People who are really familiar with the CLR will know that string is actually a reference type, so you can assign null to it anyway. But if this had been, say, the BirthDate, which is another property on here, that's a value type. It's a DateTime in the CLR, so on its own I couldn't have assigned null to it, even though the schema says it is nullable. This is an example of how we're doing some interesting things with generics in order to add to the CLR this concept of nullability and make it a first class concept. In fact, we're doing some really interesting work with the languages to make it so that it's even more of a first class concept than it is in these current builds. Eventually, I'll be able to get rid of this .Value on here.

ROBERT HESS: And it would just simply know by the context...

MIKE DEEM: It would just simply know that it's a nullable type and it would do null propagations and other interesting things.
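
The nullable value type idea discussed above can be illustrated with the generic Nullable<T> struct that later shipped in .NET Framework 2.0, where DateTime? is shorthand for Nullable<DateTime>. This is only a sketch: the nullable types in the WinFS builds shown in the demo may have differed, and NullableSketch and Describe are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class NullableSketch
{
    // Formats an optional birth date. DateTime? is Nullable<DateTime>,
    // a generic wrapper that lets a value type also represent "no value".
    public static string Describe(DateTime? birthDate)
    {
        if (!birthDate.HasValue)
            return "unknown";

        // .Value unwraps the underlying DateTime; it throws
        // InvalidOperationException when HasValue is false.
        return birthDate.Value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }
}
```

Calling Describe(null) takes the "unknown" branch, while Describe(new DateTime(1970, 1, 2)) unwraps the value and formats it. The explicit .Value here plays the same role as the .Value in the transcript's DisplayName.Value.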

MIKE DEEM: The other thing that I added up in that other routine was an Organization. It's very similar to how I added a Person item, some of that consistency.

static void CreateOrganization(Folder folder, string name,
	string phoneAreaCode, string phoneNumber)
{
	Organization o = new Organization();
	o.DisplayName = name;


	TelephoneNumber phoneEAddress = new TelephoneNumber();
	phoneEAddress.AccessPoint = String.Format("{0}-{1}", phoneAreaCode, phoneNumber);
	phoneEAddress.AccessPointType = new Keyword("POTS");
	phoneEAddress.AreaCode = phoneAreaCode;
	phoneEAddress.Number = phoneNumber;
	o.EAddresses.Add(phoneEAddress);

	folder.OutFolderMemberRelationships.AddItem(o, o.DisplayName.Value);
}

Also, an Organization and a Person are both instances of a Contact, and it's interesting - when you talk about the WinFS type system with people, a lot of times you say, Well, we're working with Contacts or we can store Contacts. Most people think, Well, that means Person. I can have a Person. But what we've done is we've recognized that there's this base type, Contact, and derived from it there are many different types, like Person and Household and Organization, and they all share the common attribute of being contactable, which essentially means that they have EAddresses. I'm using essentially the same EAddress stuff to save this in here. I'm gonna create two Person items, one that represents me and one that represents you, and an Organization that represents Microsoft. Then I'm going to list the Contacts that I've put in this folder. I'll look at how that works.

	CreatePerson(childFolder, "Mike", "Deem", "123", "456-7890", "microsoft.com", "mikedeem");
	CreatePerson(childFolder, "Robert", "Hess", "987", "654-3210", "microsoft.com", "roberth");
	CreateOrganization(childFolder, "Microsoft", "123", "456-7890");

	ic.Update();

	ListContactsInFolder(childFolder);
}

The things to notice in here are that I'm doing this Searcher again, same pattern. I've got a type. In this case it's not an item; it actually is the FolderMember type, and that represents the relationship between the folder and the things that are in the folder. It's that same collection that I was manipulating when I did this Add up here. In order to list the things in the folder, I want to actually do a search based on Folder Membership Relationships. In this case, I don't really care about the relationships themselves. I just want to kind of skip over them and get right to the items that are in there. That's what this version of the GetSearcher lets me do. GetTargetSearcher says, I want to search across the targets of this relationship and I want to narrow down the scope of my search to be just the folder I care about, and to also narrow it down to just the Contact items that I've put in that folder. Then I execute the search with the FindAll, and I write out the DisplayName for these. Let's go up and go ahead and run this. I'm actually going to comment out these other two things. We haven't looked at those yet. That code ran and I listed the Contacts that I've added there.

static void ListContactsInFolder(Folder folder)
{
	Console.WriteLine("All Contacts in '{0}' folder:", folder.DisplayName);

	ItemSearcher searcher = FolderMember.GetTargetSearcher(folder, typeof(Contact));

	foreach (Contact c in searcher.FindAll())
	{
		Console.WriteLine("   {0}", c.DisplayName);
	}

	Console.WriteLine("");
}

ROBERT HESS: What you ended up with onscreen wasn't anywhere near as exciting as the code itself.

MIKE DEEM: No.

ROBERT HESS: More fun looking at the code.

MIKE DEEM: Yeah, I'm sure that my colleagues from the Avalon Team, whenever I show these demos -- for those of you who might not know, Avalon's the new, fancy UI component that's in Longhorn -- whenever I show these demos that are just this console app, they're like, Oh no, don't do that. But really, when you start out with a new API that's not a UI API, what do you do? You write a console app.

ROBERT HESS: Yeah. At least it shows that the console app doesn't go away in Longhorn.

MIKE DEEM: Yeah.

ROBERT HESS: You still have it.

MIKE DEEM: Clearly, you can always make the UI fancier, but we're talking about the APIs here.

So the next thing I wanted to do, we'll comment this out because I can't add the same Contacts more than once. The next thing that we're gonna do is set up some relationships between these Contacts that I've created.

SetupEmploymentRelationships(ic);
RemoveContacts(ic);

What I've got here is I'm gonna find the Organization that I created and the items that represent you and me. Here I'm using kind of a shortcut for this GetSearcher. The GetSearcher's a powerful pattern. It's used all over the place. But sometimes you just want quick and dirty type searches. This FindOne is kind of a shortcut where I say I want Organizations, and I'm passing in a filter here similar to the way that I filtered down all the Documents. I'm just going for the Organization that has the DisplayName. As it turns out, I know because I've manipulated the demo data here that there's only one such Organization. There could be more than one, and if this were a full-blown application, I'd probably pop up some interesting UI that they're building into the shell called the ContactPicker. That's essentially a File Open dialog for Contact items. Instead of doing this FindOne here, I could configure the ContactPicker so that it would find Organizations and I'd tell it that I just wanted one back and it would pop up and let me browse around and look at all this stuff and just return the one Organization. Similarly, I'm finding the two Person items.

Now I want to create these relationships that say that we both work for Microsoft. Similar to the FolderMemberRelationships that we had, now we have EmploymentRelationships. You may have noticed the Out on this. Relationships are directed. They go from an item to another item. You always attach a relationship to the Source item, to the From item. The relationship is essentially part of the From item. If the From item goes away, the relationship gets deleted. It's interesting and kind of surprising to some people. When we looked at Employment and the EmploymentRelationship, we had to make a choice. Is the From item the Organization or is it the Person? At one point it was the Organization and right now it's the Person, and it's kind of taking the position that Longhorn's a client release; it's end user-focused. The center of the universe is the actual people. That's kind of the reasoning.

static void SetupEmploymentRelationships(ItemContext ic)
{
	Console.WriteLine("Setting up employment relationships...\n");

	Employment employment;

	Organization microsoft = ic.FindOne(typeof(Organization),
		"DisplayName='Microsoft'") as Organization;
	Person mike = ic.FindOne(typeof(Person),
		"DisplayName='Mike Deem'") as Person;
	Person robert = ic.FindOne(typeof(Person),
		"DisplayName='Robert Hess'") as Person;

	employment = mike.OutEmploymentRelationships.AddEmployer( microsoft,
		RelationshipMode.Reference );
	employment.EmployeeTitle = new Keyword("Lead Program Manager");

	employment = robert.OutEmploymentRelationships.AddEmployer( microsoft,
		RelationshipMode.Reference );
	employment.EmployeeTitle = new Keyword("Group Manager");

	ic.Update();

	ListEmployeesOfOrganization(microsoft);
}

There's also some security implications with the information that goes on the relationship itself. I kind of alluded to that when I talked about how there's a directory entry between a directory and a file in the file system, or that the relationships themselves are objects and they have their own properties on them. Every item in WinFS has an ACL on it just like a file in a file system, and that determines who can read and write the data on that item. It also determines who can read and write the data that's on the relationships that are owned by that item, the From items. What we wanted to do is say that the security on the employment data is owned by the employee rather than the employer.

This is an example of one of these. I'm gonna set the title on here. In order to add the employee, I essentially say, Add Employer, and I pass in Microsoft, and I'm also passing in which mode of relationship I want. WinFS relationships have three modes: Holding, Reference, and Embedding. Holding controls the lifetime. It's how things go into folders. When I added things to folders, I didn't have to specify this because it's always Holding, whereas this relationship could be Reference or Embedding. In some scenarios you want to take and essentially embed the Organization inside the Person, so to speak; it's just local data for them. Sometimes you want to share an Organization between many different People. In this case I want to share the same Organization because we're both employed by the same Organization, so I'm doing a Reference relationship. I add two of those and again, I do the update when I'm all done. Then I'm gonna list all the employees of the Organization. Again our Searcher idea, exactly the same pattern that we did for FolderMembership. In this case, I'm getting the EmployeeSearcher for a given organization. Then I'm going to loop through all those and write out the DisplayName that we've got. We'll go ahead and run this. That actually set up those relationships and then showed that we were both employed by Microsoft. There are a whole lot more People than just us employed by Microsoft.

ROBERT HESS: Thank goodness.

MIKE DEEM: Yeah.

ROBERT HESS: Otherwise we've got a lot of work to do.

MIKE DEEM: Yeah. Okay. I didn't comment out the RemoveContacts. We'll look at that really briefly here. For Remove, all I did was go in and, remember that Example Contacts folder that I created inside of the My Personal Contacts folder? I actually just removed that from the parentFolder here. Just like in the file system, sort of, the act of removing the folder causes the things in it to get removed.

ROBERT HESS: Does that mean we're fired now?

MIKE DEEM: Yeah. I suppose, well Microsoft's gone too, so.... Unlike the file system, which actually prevents you from deleting a folder that has files in it -- there's some higher level APIs that you might be able to call that does it as one step, but as far as the file system's concerned, you're not allowed to do that -- WinFS doesn't make that distinction. It just says, Okay, I'm removing this folder, and anything that's in that folder that's not in any other folder will also get removed. That's a very important point. The fact that I put these Person items and the Organization item in my ExampleContacts folder was just my choice. I could have also added those exact same items to more than one folder. If I had done that, and then I went and removed my Example Contacts, those items would still be in the system. Here, I'm kind of cheating. I know that I only put them in one place so I know that they're gone when I remove the folder that I put them in.

static void RemoveContacts(ItemContext ic)
{
	Console.WriteLine("Removing Contacts...\n");

	Folder parentFolder = UserDataFolder.FindMyPersonalContactsFolder(ic);
	parentFolder.OutFolderMemberRelationships.Remove("Example Contacts");

	ic.Update();
}
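
The lifetime rule described above (an item goes away only when the last folder holding it is removed) can be modeled with a small sketch. This is an illustration of the rule as stated in the conversation, not WinFS code: HoldingStore and its members are hypothetical, and items and folders are reduced to plain names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of holding links; none of these types are WinFS types.
public class HoldingStore
{
    // Maps each item name to the set of folder names that currently hold it.
    private readonly Dictionary<string, HashSet<string>> holders =
        new Dictionary<string, HashSet<string>>();

    // Adds a holding link from the folder to the item.
    public void AddToFolder(string folder, string item)
    {
        HashSet<string> set;
        if (!holders.TryGetValue(item, out set))
        {
            set = new HashSet<string>();
            holders[item] = set;
        }
        set.Add(folder);
    }

    // Removes the folder's holding links; items left with no holder are deleted.
    public void RemoveFolder(string folder)
    {
        List<string> orphaned = new List<string>();
        foreach (KeyValuePair<string, HashSet<string>> entry in holders)
        {
            entry.Value.Remove(folder);
            if (entry.Value.Count == 0)
                orphaned.Add(entry.Key);
        }
        foreach (string item in orphaned)
            holders.Remove(item);
    }

    public bool Contains(string item)
    {
        return holders.ContainsKey(item);
    }
}
```

If "Microsoft" is added to both "Example Contacts" and a second folder, removing "Example Contacts" deletes the items held only there and leaves "Microsoft" in the store, which is the case the demo avoids by putting everything in a single folder.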

ROBERT HESS: Otherwise, you'd have to take and iterate through them and find out what other folders are they in and then remove those folders they're in.

MIKE DEEM: Yeah. At this point, we've kind of looked at all the core concepts in the WinFS API. There's not that many of them. You do queries. You modify the data. I didn't actually show that, but I could do a query for a Person item, I could change it, add a middle name, and save it back to the store just by calling ItemContext.Update, and I can manipulate relationships. A lot of times I query through relationships or using them, I can manipulate them, I can do all kinds of different things. Those are the basic programming patterns. Most of the richness in the WinFS API and in WinFS has to do with the types that we ship with the system. The fact that I can go in and write similar code to manipulate albums and artists and music tracks as I did to manipulate Employer/Employee relationships is really the real power of WinFS. I don't need to show you each one of those different things for you to get the idea that this is how the API works. But we go way beyond that. The WinFS API is kind of the API for WinFS. It has to support very rich, sophisticated applications as well as those quick and dirty applications that I just want to bang out and play around with something or some little admin task I have to take care of.

ROBERT HESS: That's kind of different than the way it is with database programming, because in database programming you have several different API models, whether it's ADO, or RDO, or Jet, or whatever, and you kind of pick the right one for the difficulty or complexity of the programming you're wanting to do. But you're now saying that the WinFS API needs to be all things to all people.

MIKE DEEM: Yep. That's a real challenge. What it does is make us decide, we've got this relatively simple API surface. You've seen the basic concepts at this point. Those work great in many, many applications. But at some point, you get to the point where I need something a little more sophisticated. I need to be able to work across many, many thousands of items and be able to display those in a grid and interact with them in rich, sophisticated ways. Or I need to be able to monitor WinFS for changes across a bunch of items and react to those things asynchronously. We've provided functionality in the WinFS API for doing those sorts of things as well.

The first one, the interacting with a kind of a grid-style over a lot of data, is something that we've called Rich Application Views or RAV. We keep trying to change that name. Rich Application Views is too long but RAV just sounds so cool. It's kind of stuck. What I'm gonna show here is how I set up one of these Views across almost 4,000 documents that I've got in my system.

static void CreateRichApplicationView(ItemContext ic)
{
	int pageSize = 10;

	Console.WriteLine("Creating View...\n");

The idea here is that many applications, the shell, you look at an email client like Outlook, you look at applications like Money or Quicken, many applications start up with this grid view as this central pane. And usually what you're seeing in that grid is just a small subset of all the data that's available, and it's sliced in two ways. It's sliced vertically where I say, I only care about these columns of data. It's also sliced horizontally in that I'm only looking at this particular page of the data. In addition, on top of that view that I've established over all my data, I often want to do additional filters where I want to type down to particular things or I want to resort it in different ways, or I want to be able to group it and, say, in this case maybe I want to group documents by authors. The idea behind the rich view support in the WinFS API is to do all the databasey, mungey, low-level work that it takes to actually build these very high level GUI-intensive views. It's not a view control. It's not a grid control. We're not building UI. What we're doing is building the infrastructure that you can use and you can bind it to your Avalon UI or you can bind it to WinForms or write your own UI on top of it.

When I want a View, the first thing I do is start with a Searcher. I'm gonna make a View across all the Documents, same Searcher pattern we've seen before. Now I'm going a step further where I'm gonna say I want a ViewDefinition and I'm gonna pick out the fields that I want in this ViewDefinition. I want the DisplayName; I'm gonna call it DisplayName and it's pulled from DisplayName. Author here, the Author of a Document is actually stored off in a relationship. This OutgoingRelationships collection is like those other ones, OutFolderMemberRelationships and stuff like this. But this one's a collection of all relationships owned by this item. I'm going to filter down all those relationships based on type with this Cast, and then I'm going to pull out the data in this relationship; I'm going to subselect within that data where the Role is Author, and then I'm gonna get the DisplayName.

// 1. Get Searcher -- for finding all documents
ItemSearcher searcher = Document.GetSearcher(ic);
			
// 2. Define View
ViewDefinition viewDef = new ViewDefinition();
viewDef.Fields.Add("DisplayName", "DisplayName", true);
viewDef.Fields.Add("Author",
	"MAX(OutgoingRelationships.Cast(" +
	"System.Storage.Core.DocumentAuthor).Data[" +
	"Role[Value='Author']].DisplayName)");

What I've done is essentially said, I want my View to have an Author column, but the data for that Author column comes from this relationship that's off to the side of the actual Document. Whereas, when I was manipulating the objects before you saw how I pulled back the Contact objects themselves, or the Document objects themselves, I didn't have the relationships loaded in memory at that point, just the Document. Here I want data from both of these things. I want the DisplayName from the Document and the Author from the relationship. Category's another thing that's often outside of the item itself. In this case, it's in something called Extensions, an extensibility point on Item, and there's a Keywords Extension, and within there, there's a Keyword Value. I'm gonna pull back the category here.

Now right now in our Views, we have to do this aggregation function around these. It says MAX. Eventually, we're gonna support -- what that does is limit the result to a single Author so it kind of sorts them alphabetically and takes the last one in this case. In the data that we actually have, there's just one, so it kind of works. Eventually, it's not going to be limited in this way in that the View that I have, this particular cell, the Author cell here, can come back with a collection of Authors. If there's more than one Author of a Document or if there's more than one category in a Document, I get back the collection of those things. If you think about a database view, it's this flat grid; you get scalar values in each of the cells, whereas with the Rich Application Views, the cell can be a scalar value but it can also be a collection. It can also be a whole object. I could have said that instead of just getting the Author's name, I want all the Author relationships, and get back those whole objects in my View. Similarly, I'm pulling out title, and the time that the Document was created, and the time it was last modified.

viewDef.Fields.Add("Category",
	"MAX(Extensions.Cast(" +
	"System.Storage.Core.ItemKeywords).Keywords.Value)");
viewDef.Fields.Add("Title", "Cast(System.Storage.Core.Document).Title", true);
viewDef.Fields.Add("Created", "CreationTime", true);
viewDef.Fields.Add("Last Modified", "LastModificationTime", true);
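
The effect of the MAX aggregation described above, collapsing a possibly multi-valued field such as Author into the single value that sorts last, can be sketched like this. The sketch assumes a plain ordinal string comparison, which may not match the ordering the store actually uses, and MaxSketch and LastAlphabetically are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class MaxSketch
{
    // Returns the name that sorts last under an ordinal comparison,
    // or null when there are no names at all.
    public static string LastAlphabetically(IEnumerable<string> names)
    {
        string max = null;
        foreach (string name in names)
        {
            if (max == null || string.CompareOrdinal(name, max) > 0)
                max = name;
        }
        return max;
    }
}
```

With a single author the result is just that author, which is why the demo data works; with several authors all but one are dropped from the cell.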

The next step, once I've defined what I want in my View, is to actually create the View. The Searcher tells me I'm creating a View across Documents. If I had put filters in here, like I did before where I filtered it down to just the Documents with a certain word in the title, then the View would be just across those particular Documents. I'm shaping the View based on my ViewDefinition that I've got.

// 3. Create View
ApplicationView appView = ApplicationView.CreateView(searcher, viewDef);

And then I'm going to go the next step and actually pull data out of my View. That's done with something called a VirtualViewRecordCollection. It's virtual in the sense that it looks like a collection. It looks like it has all 3,800 some rows in it for all the Documents that I've got, but it really only pulls into memory a few at a time. In this case, I've said way up at the top, I think I set the page size to 10 or 20 or something like that. I'm pulling back just that many. No matter where I jump around in this VirtualViewRecordCollection, it's pulling back just the page that contains the record that I'm asking for. If you can imagine if this were a UI app, I could take this VirtualViewRecordCollection and I could bind it to a List Control or a Grid Control of some sort. We handle all the paging behind that. You can ask the record collection for the count of the number of records, and you can set your scroll bar size and you know where you are in there so you can get all the thumb positioning on the scroll bar and all that kind of stuff.

// 4. Retrieve data by VirtualViewRecordCollection
VirtualViewRecordCollection vvrc = new VirtualViewRecordCollection(appView, pageSize);
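
Illustrative sketch (not from the show): the paging behavior described above can be approximated in plain C# by a collection that reports a full count but only fetches the page containing the index you ask for. PagedCollection, its fetchPage callback, and the generated "Document N" rows are hypothetical stand-ins, not the WinFS implementation; the count of 3,800 and page size of 20 are loosely taken from the figures mentioned in the demo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a virtual, paged record collection.
public class PagedCollection<T>
{
    private readonly Func<int, int, IList<T>> fetchPage; // (offset, count) -> rows
    private readonly int pageSize;
    private readonly Dictionary<int, IList<T>> cache = new Dictionary<int, IList<T>>();

    public int Count { get; }
    public int PagesFetched { get; private set; }

    public PagedCollection(int count, int pageSize, Func<int, int, IList<T>> fetchPage)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        Count = count;
        this.pageSize = pageSize;
        this.fetchPage = fetchPage;
    }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException(nameof(index));

            // Fetch only the page that contains the requested record.
            int page = index / pageSize;
            if (!cache.TryGetValue(page, out IList<T> rows))
            {
                int offset = page * pageSize;
                rows = fetchPage(offset, Math.Min(pageSize, Count - offset));
                cache[page] = rows;
                PagesFetched++;
            }
            return rows[index % pageSize];
        }
    }
}

public static class PagingDemo
{
    // Builds a sample collection whose rows are generated on demand.
    public static PagedCollection<string> CreateSample()
    {
        return new PagedCollection<string>(3800, 20, (offset, count) =>
        {
            var rows = new List<string>(count);
            for (int i = 0; i < count; i++) rows.Add("Document " + (offset + i));
            return rows;
        });
    }

    public static void Main()
    {
        PagedCollection<string> records = CreateSample();
        Console.WriteLine(records[312]);          // prints "Document 312"
        Console.WriteLine(records[2532]);         // prints "Document 2532"
        Console.WriteLine(records.PagesFetched);  // prints "2"
    }
}
```

Because Count is known up front, a UI could size its scroll bar from it without loading any rows, which is the property the demo relies on.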

So if you've written database applications using ADO.NET or something like that, it's very, very difficult to write. It takes a lot of lines of code to kind of bridge this gap between what my UI looks like and how I get data back from the database. What Rich Application Views are all about is bridging that gap, doing it once and for all, in a way that works for every application that wants to run on top of WinFS.

The next thing I'm gonna do is actually pull out the individual View Records. I'm just gonna pull out two of them at random, number 312 and 2532, and print out the data there.

ViewRecord vr;

Console.WriteLine("View Records: ");

vr = vvrc[312];
Console.WriteLine( "   312: {0}, {1}, {2}, {3}, {4}",
		vr[0], vr[1], vr[2], vr[3], vr[4] );

vr = vvrc[2532];
Console.WriteLine( "  2532: {0}, {1}, {2}, {3}, {4}",
		vr[0], vr[1], vr[2], vr[3], vr[4] );

I'm going to comment out this last little bit, which is getting into another really cool feature, and go ahead and run this so that we can see kind of the results at this point. Creating the View in this case takes a couple seconds. The actual act of creating the View involves creating a temp table in WinFS and caching that data. We're working on infrastructure that lets me get back the first page of View data very, very quickly while the View gets created in the background. That gives you a very responsive feel to your application. I created the View and I pulled out those two records from there. Pretty straightforward stuff.

The next thing I want to do in here, one of the cool things about Views is the ability to do Groupings. If you're familiar with an application like Outlook, how you can group all your mail messages by subject and then by sender, so you get this hierarchical grouping capability, and you can expand, collapse those Groups. You can also think of it as just like a tree control where you can expand, collapse things. Getting this big, flat list of all the documents is interesting and powerful, but that's not usually how you want to see them. You want this grouping capability. You want to be able to group by Author in one case and Title in another case. Rich Application Views also support groupings.

I'm gonna define a Group here that's based on Author and I'm gonna pull back an aggregated value. I'm gonna count the number of Documents under each Author, and I've defined this as a Group. I'm adding it to my ViewDefinition, and then I'm going to regroup my View. At this point, I've already created the View. By the way, the fact that I created the View and created this temp table, that temp table went away when this application ended. These are transitory things. But when I run it again, I'll create the View again, pull out those two records, and then I'm going to regroup the View. I could have set the grouping up when I initially created the View, but what this is illustrating is that once I've paid that cost of setting up the initial View, the user can interact with my UI and change the grouping, change the sort orders, change all kinds of things about how they want to see the data, and you don't have to go back and recreate the entire View. We just remanipulate the data that we've already pulled out.

I'm gonna start off with the View in a completely collapsed state. Now I'm just going to iterate through all the records in my VirtualViewRecordCollection and I'm gonna pull out the record count and the Author and the number of Documents the Author has written within each Group. There's actually quite a few of them here.

// 5. Group Data in View

Console.WriteLine("Regrouping View...\n");

ViewGroup vg = new ViewGroup("Author");
vg.AggregateFields.Add("GroupCount", "Count(DisplayName)");
viewDef.Groups.Add(vg);

appView.ReGroupView(viewDef.Groups, null,
	ExpandCollapseState.CollapseAllGroups);

Console.WriteLine("Groups:");

for(int i = 0; i < vvrc.Count; ++i)
{
	Console.WriteLine("  {0}: {1}, {2}", i,
		vvrc[i]["Author"], vvrc[i]["GroupCount"]);
}

Console.WriteLine( "" );
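
Illustrative sketch (not from the show): the point about regrouping is that the rows are fetched once and then reshaped in memory. The snippet below shows that idea with LINQ over an already-cached list, producing one entry per Author with a document count. The Row type, the author names, and the ordinal sort are hypothetical; this is not how ReGroupView is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RegroupDemo
{
    // Hypothetical cached row: stands in for data the view already pulled out.
    public sealed class Row
    {
        public readonly string Author;
        public readonly string Title;
        public Row(string author, string title) { Author = author; Title = title; }
    }

    // Groups the cached rows by author and counts the documents per group.
    // No new query is issued; only the in-memory rows are reshaped.
    public static List<KeyValuePair<string, int>> GroupByAuthor(IEnumerable<Row> cachedRows)
    {
        return cachedRows
            .GroupBy(r => r.Author)
            .OrderBy(g => g.Key, StringComparer.Ordinal)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .ToList();
    }

    public static void Main()
    {
        var cached = new List<Row>
        {
            new Row("Author B", "Doc 1"),
            new Row("Author A", "Doc 2"),
            new Row("Author B", "Doc 3"),
        };

        foreach (KeyValuePair<string, int> group in GroupByAuthor(cached))
        {
            Console.WriteLine("{0}: {1}", group.Key, group.Value);
        }
        // prints:
        // Author A: 1
        // Author B: 2
    }
}
```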

It's creating the View again, and it scrolled through all those pretty quickly there. I can see here that Ted Hart wrote two of the Documents. Now the test data that we've got on this machine is very varied. In order to get together 4,000 Documents, we basically took maybe a couple hundred and copied them over and over and over again and some things like that. So you get some interesting results sometimes. But that kind of illustrates how I go and I set up the View. Now I have another application here that's kind of an implementation, or uses the same Views that I illustrated in the code here.

ROBERT HESS: Except not in console mode.

MIKE DEEM: Not in console mode. It actually uses a really straightforward WinForms grid. We built this application to test some of the timing aspects and you can see kind of down at the bottom here where we're reporting timing information. As we're making design choices, we prototype how we want to do something in order to implement these Views and then we use this application and say, well, was that better or worse? What I've done here is I've got a View over all the Documents that are in the system, okay?

ROBERT HESS: Where do you want people to be watching exactly?

MIKE DEEM: These drop-down controls will disable and the display will refresh. What you'll see is when I first hit this button, the first page of data comes up very quickly. Then when the rest of the View is created in the background, we refresh the UI here. It's just an artifact of the way we've implemented it. You can do UIs that would make that transparent, but it kind of gives you this idea of how we can pull out this initial page very, very quickly and then fill the rest of the View in the background. I've got the first page and the View is now created at this point.

Now I can go in and we'll set up a grouping here. We'll do the Group By Author. Now I've created my Group By Author, I'm showing my page size right now as 10, and I'm showing the first of 20 pages, and I can page through that. I can expand out one of these Groups. You see how when I expanded this out, I saw all the Documents within the Group, and apparently he has written a lot of Documents. I don't know where his came to an end, but he wrote a lot of Documents. Now there's only four for this person. A much more reasonable number. When you do this expand/collapse, we're actually remembering -- let's see, I can expand this and go back up here and then go back down a page and we've actually remembered this one's expanded out. Now since the View remembers what's expanded and collapsed, it adjusts the record count in that VirtualViewRecordCollection. Again, I can take that whole collection, I can bind it to my UI, my scroll bar's exactly how I need it to be, it knows exactly how many things are in that collection, and then as they expand/collapse, that count changes, my scroll bar stays up to date. It's actually really cool. Many, many hundreds of lines of code that anyone who has written sophisticated database applications, I'm sure, has had to crank out on more than one occasion.
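
Illustrative sketch (not from the show): one way to picture how expanding and collapsing changes the record count is to treat each group header as one visible record and add a group's documents only while it is expanded. GroupedView and the group sizes below are hypothetical and say nothing about how WinFS tracks this internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a grouped view that remembers expand/collapse state.
public class GroupedView
{
    private readonly List<int> groupSizes = new List<int>();
    private readonly HashSet<int> expanded = new HashSet<int>();

    // Adds a group with the given number of documents; returns its index.
    public int AddGroup(int documentCount)
    {
        groupSizes.Add(documentCount);
        return groupSizes.Count - 1;
    }

    public void Expand(int group) { CheckGroup(group); expanded.Add(group); }
    public void Collapse(int group) { CheckGroup(group); expanded.Remove(group); }

    // Visible records: one header per group plus the rows of expanded groups.
    public int Count
    {
        get { return groupSizes.Count + expanded.Sum(g => groupSizes[g]); }
    }

    private void CheckGroup(int group)
    {
        if (group < 0 || group >= groupSizes.Count)
            throw new ArgumentOutOfRangeException(nameof(group));
    }
}

public static class GroupedViewDemo
{
    public static void Main()
    {
        var view = new GroupedView();
        view.AddGroup(2);              // an author with two documents
        int four = view.AddGroup(4);   // an author with four documents
        int many = view.AddGroup(40);  // an author with many documents

        Console.WriteLine(view.Count); // prints "3" (all collapsed)
        view.Expand(four);
        Console.WriteLine(view.Count); // prints "7"
        view.Expand(many);
        Console.WriteLine(view.Count); // prints "47"
        view.Collapse(four);
        Console.WriteLine(view.Count); // prints "43"
    }
}
```

A scroll bar bound to Count would stay in step with the user's expand and collapse actions, which is the behavior shown in the demo.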

One of the other cool things that we've got on Views is the ability to filter over the data that's in the View already. I can set up my initial View, has all the Documents in it, and then I want to be able to, in this case, I want to type a word that I want to find in there. What we do is, as I type these characters, we're actually doing an additional query against the view that I've got and pulling out just the records that match that. In this case, I'm looking in a number of fields, in Title and DisplayName, and a number of different fields. You can choose which fields you want to look in. You can see what I've done as I typed is I was querying against those 3,000 Documents and narrowing it down progressively to just the Documents that have "cache" in the title. You can see as I do that, the set there changes.
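
Illustrative sketch (not from the show): filtering as you type can be pictured as re-running a cheap match over rows the view already holds, so each extra character narrows the set. The snippet below does a case-insensitive substring match across every field of each cached row. FilterDemo, the sample rows, and the choice of substring matching are assumptions; the transcript does not say how WinFS matches the typed text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterDemo
{
    // Returns the cached rows in which any field contains the typed text
    // (case-insensitive). An empty text matches every row.
    public static List<string[]> Filter(IEnumerable<string[]> cachedRows, string text)
    {
        return cachedRows
            .Where(row => row.Any(field =>
                field != null &&
                field.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
    }

    // Hypothetical cached rows: { Title, DisplayName }.
    public static List<string[]> SampleRows()
    {
        return new List<string[]>
        {
            new[] { "Cache design", "Author A" },
            new[] { "Catalog notes", "Author B" },
            new[] { "Storage overview", "Author C" },
        };
    }

    public static void Main()
    {
        List<string[]> rows = SampleRows();

        // Simulate the user typing "c", then "ca", then "cac".
        foreach (string typed in new[] { "c", "ca", "cac" })
        {
            Console.WriteLine("{0}: {1}", typed, Filter(rows, typed).Count);
        }
        // prints:
        // c: 3
        // ca: 2
        // cac: 1
    }
}
```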

ROBERT HESS: I mean, this is pretty exciting because this shows an actual application that's using some of the features that you put into WinFS.

MIKE DEEM: Mm hm.

ROBERT HESS: While it may not be the killer application we were talking about earlier...

MIKE DEEM: The shell's doing the same stuff. It just looks a whole lot nicer.

ROBERT HESS: But WinFS clearly, I think, has a lot of capabilities that application developers are going to need to use in order to really supply their users with information. Because I mean face it, disk drives now are getting in the gigabyte size range easily, whereas it used to be just a single floppy you had to deal with. That means an awful lot of data that we're using constantly, and we're always finding more and more information we can stick on our machines. Something like WinFS, I think, is pretty important for caching those things. You personally, what do you think is the most exciting thing about WinFS?

MIKE DEEM: I don't know. I keep thinking about WinFS as something that I want to be able to use, as Longhorn as something that I want to be able to use in my everyday life. I'm a Program Manager at Microsoft. That essentially means that I'm always juggling ten different tasks at the same time, interacting with 20 different people and a bunch of different e-mail threads, giving presentations to people, having lots of meetings to go to. Trying to keep track of all that information, who I talked to about what, and who's seen which documents, and who I'm waiting for review feedback from various documents. I can use WinFS and I can use Longhorn to make my job easier. As I'm working through these design things, I'm like -- what do I want? How do I need this to work for me, to make my job easier? I think just overall, what we're trying to put together with WinFS, and to make it so I can relate these items together and navigate between these relationships, and find the information that I want, is just incredibly powerful and enabling.

ROBERT HESS: I would expect a lot of people in the audience, after having heard Quentin and Anil talk about the underlying architecture and some of the capabilities that it's applying and now, specifically, after seeing you and this code that you can do with WinFS and Longhorn, they're probably getting kind of excited, so I want to know how do I get a hold of this further? How do I actually write some applications? What are some sources of information that you can recommend people go to now in order to find out more information about developing WinFS applications for Longhorn?

MIKE DEEM: There's a lot of information on the MSDN Microsoft Web site. There's a complete version of the Longhorn SDK, the version that we made available to attendees of the PDC. You can take a close look at the API, the WinFS APIs, as well as a lot of other APIs within that documentation. What you'll see there is a little bit different than the API that I showed now. We've progressed some since the PDC and we've got a whole lot of new things in mind that we're gonna make this API even simpler, easier to use, more powerful. I mean that's a really good source of information. There's a lot of blogs out there on WinFS.

ROBERT HESS: Plus you've got your own blog, perhaps?

MIKE DEEM: Yeah, I've got my own blog, but I haven't been updating it as often as I should, so maybe I'll get back to doing that.

ROBERT HESS: Maybe if users of this show want you to start your blogging more, they can leave postings on my blog, on blogs.msdn.com/theshow, and let you know, and maybe you can get your own blog moving over again.

MIKE DEEM: Yep. Yeah.

ROBERT HESS: Thanks, Mike, for joining us. I appreciate the information you've showed us in the code you had there, and I'm sure it helped our audience quite a bit.

MIKE DEEM: All right. Thank you.

ROBERT HESS: Hopefully, that gives you an idea of what WinFS looks like from a programmer's perspective. You saw how simple some of the APIs were, how you could quickly, easily, open up a context to query that context or information to display that context. You also saw some glimpse of how you could even further get some further advanced capabilities and really dive into some of the information that WinFS is holding and display it on the screen in fairly rich fashion, even in a console application. Thanks for joining us.

Somebody @ Microsoft.com

Erica Wiechers, Program Manager, Microsoft Corporation interviews Jeremy Mazner, Technical Evangelist for Longhorn in the Platform Strategy Group.

ERICA WIECHERS: With me today is Jeremy Mazner. He's a Technical Evangelist focusing on WinFS. Jeremy, welcome.

JEREMY MAZNER: Thanks for having me.

ERICA WIECHERS: You're welcome. Why don't you start by telling us what it's like to be a Technical Evangelist, maybe in general at Microsoft, and then also how you focus especially on WinFS.

JEREMY MAZNER: Oh, we always get in danger when you're trying to describe anything in general at Microsoft because you know, every team's going to do things differently. For the last year, up until PDC, we spent a lot of time as evangelists working with the product teams and really understanding what they were planning to build and thinking about how we would help them talk about that to developers, to our developer community, whether they're ISVs or Enterprise developers. That stage sort of culminated in PDC. My team got to have a role in helping out with the keynotes and helping out with some of the overview talks and really trying to concisely explain what Longhorn is, what WinFX is, what we think the value is to developers, and the kinds of applications we hope they build for end users and consumers.

Then after PDC, we've sort of done a reset and we're trying to think, well what should we focus on now? Some of it is getting to go out and actually talk to partners one-on-one, people like Adobe and Amazon and Merck, we all had up at PDC and some additional customers. Part of it is working still with the product teams to understand any changes that they're making, new features; we know that there were a lot of things that weren't available at PDC. Some of the teams talked about their future work, some of them didn't. We're helping them think about their future work. Then probably the last significant thing that we do as a team is work with the Microsoft worldwide evangelism field. There's only maybe 50 or 60 evangelists in my greater organization at Microsoft, but there are 700 or more worldwide. Wherever you're located in the world, there's a developer evangelist nearby and we try and help those developer evangelists understand Longhorn. So if you have questions, there's someone you can talk to locally.

ERICA WIECHERS: Right. What are some of the challenges of trying to evangelize future technologies, or in development technologies, when a lot of your customers are still working on the released current technologies?

JEREMY MAZNER: Oh, there's a lot of challenges. On the one hand certainly, trying to talk anything at all about a product that isn't released yet and still has some time to go -- we don't know how long yet, but some time still -- is always a challenge because things change. Things changed in the year before PDC. I'm sure things are changing after PDC, particularly as we take customer feedback and learn what people like, what they didn't like, what's missing. It's always hard to try and get a message when you know the message is going to be changing, and that what we said at PDC was accurate at that snapshot in time, but it's going to change over time. In some ways, you don't want to be a victim of your success and come out and have everyone remember that Longhorn and WinFX was this particular thing if it's going to end up changing a little bit over time and being something different.

ERICA WIECHERS: That's true.

JEREMY MAZNER: That's always hard as the product changes underneath you. The flip side: there are plenty of customers, solution providers, all the way up to the biggest ISVs, who are very focused on making money today. Talking about a product that's still, like I said, sometime out there, we don't know how far out, is difficult. One of the things that we always try to do is keep an emphasis on how you can get ready today for Longhorn and things you can do with WinForms, and by integrating in with Office, using things like the new Offline Client Block that came out from the Prescriptive Architecture folks, and take some of those steps to succeed today and be able to generate value while still getting ready for Longhorn in the future.

ERICA WIECHERS: Right. Have you been an Evangelist your whole time at Microsoft?

JEREMY MAZNER: I haven't. I'm actually, I'm busily working my way through the entire list of jobs at Microsoft. I've been doing evangelism for about two years. I started off as a developer, so I get a little bit of credibility. I spent four years working on all kinds of things; a little utility called Internet Connection Wizard, a utility called Connection Manager, some parts of Site Server, some parts of SharePoint Portal Server, and then I got to spend a year as a Program Manager on the SharePoint Portal Server Team which was nice to get to step a little bit away from code and more towards working with customers and seeing how they use the product. Then I took a year away from product completely and got to be a PR guy in the Speech Writing organization. And then I've been an Evangelist for two years.

ERICA WIECHERS: Speech writing, did you write for any particular individual at Microsoft?

JEREMY MAZNER: I did. I'm trying to be cagey about it. But Speech Writer's not quite the right word, but I was Steve Ballmer's Presentation Manager. I think I convinced them to make me business cards that said Communications Manager for the Office of the CEO.

ERICA WIECHERS: Sure, nice.

JEREMY MAZNER: But it wasn't, I wouldn't say that anyone writes things for Steve to say. He's quite able to communicate well on his own. But I tried to help when there was a speech coming up, or something like the launch of Office, or the launch of Pocket PC 2002, I got to work on that. Got to sort of communicate between the Product Team back to Steve and try and make the speeches a little bit better where I could.

ERICA WIECHERS: Sure. What are some things that were challenging about that job?

JEREMY MAZNER: Oh he's just a great speaker on his own. Steve could walk into a room completely unprepared, and he understands the technology and his audiences so well that even given no preparation, he gives a great speech. The question is, what can you do as a speech writer to take that already great speech and make it even better?

ERICA WIECHERS: Right. Fine-tune it.

JEREMY MAZNER: That was the toughest challenge.

ERICA WIECHERS: So what did you do before coming to Microsoft?

JEREMY MAZNER: I came straight out of college, but I had a couple summer internships working at some little companies. They both have "book" in the title, strangely, but one was called BookMaker, which made a printing utility called ClickBook that I think is now owned by Hewlett-Packard. The other was a company called Books That Work that made multimedia titles. They are now owned by Sierra Online. Being startups, I didn't really do one thing in particular. I did a little bit of testing, a little bit of development, a little bit of office management, a little bit of coffee making, I got to design the stationery for one of them.

ERICA WIECHERS: You have seen it all. The whole gamut of all these positions.

JEREMY MAZNER: I'm working slowly but surely.

ERICA WIECHERS: What are some things you like to do outside of work?

JEREMY MAZNER: I enjoy playing music a lot, both jazz and some good 80s rock.

ERICA WIECHERS: Guitar?

JEREMY MAZNER: A little bit of everything, drums, keyboards, guitar.

ERICA WIECHERS: You're just a jack-of-all-trades, no matter what you do, right?

JEREMY MAZNER: Yeah, everything a little bit and nothing terribly well. I enjoy playing music, I enjoy sports. I've recently discovered Wallyball.

ERICA WIECHERS: What is that?

JEREMY MAZNER: It's volleyball on a racquetball court. Interesting sport, but you can play a pretty good game with only four people. To play a good game of volleyball with four people, you actually have to know what you're doing. Wallyball has plenty of room for being sloppy and you can still keep playing.

ERICA WIECHERS: Smaller space.

JEREMY MAZNER: Yep. And of course, I would probably be in trouble if I didn't say I enjoy spending time with my wife, Karen. But she's in grad school so we actually don't get to spend much time together.

ERICA WIECHERS: Oh really, wow. You say you like to play jazz. Are you in a band or do you have anything on the side?

JEREMY MAZNER: Nope, just enjoy jam sessions. I actually don't like being in a band because I feel if I'm really gonna do a performance and ask people to pay, that I owe them a certain level of rehearsal and competence, which I lack. I'm not really that competent, don't like rehearsing, but I love just getting together and playing music. In fact, what was the conference I was at? At one of the conferences, they have these jam sessions. It was at TechEd or MEC, or something like that.

ERICA WIECHERS: And PDC. I think they do them all the time now.

JEREMY MAZNER: I missed it at PDC, but I've been on stage a couple of times, just jamming around with people. That I enjoy very much.

ERICA WIECHERS: That's fun. You'll have to do a cameo in Don Box's band some time.

JEREMY MAZNER: We'll see. We'll see. They're a pretty high-quality band, so, hard to break into.

ERICA WIECHERS: What would you say is the best thing about your current job?

JEREMY MAZNER: Best thing about my current job?

ERICA WIECHERS: Or one of the best.

JEREMY MAZNER: Other than getting to work with Robert, which of course is an honor and a privilege, and actually getting to work with Quentin and Anil and Mike. Those guys are all great to work with. I think it's actually probably the ability to stay focused in a very broad sense on all of Longhorn, and to get to work with both the people writing the product and the customers who are going to use the product. As a Developer, you're sort of sitting in your office writing code. As a Program Manager, you get to step out a little bit and get to talk to customers and get some feedback, but as an Evangelist, that's actually a key part of my job is to understand both the code and understand customer feedback, and be able to talk to customers and present it to them. So I really enjoy getting to look at sort of everything we're building, and then going out and talking to customers about it.

ERICA WIECHERS: Great. Jeremy, thanks for joining us today.

JEREMY MAZNER: Thanks. I enjoyed it very much.

JEREMY MAZNER: I'll continue to explore the potential for WinFS and its impact on Windows developers at my blog, http://blogs.msdn.com/jmazner, where I'm looking forward to hearing what the dev community thinks of our plans for Longhorn. I'll also be hosting a series of interviews for the new Channel Nine Web site at http://channelnine.net.

Closing Comments

ROBERT HESS: That brings us to the end of another episode of the .NET Show.

ERICA WIECHERS: Stay tuned for the next episode on Avalon.

ROBERT HESS: Until then, we'll see you on the Web.

Links to Further Information

The Longhorn Developer Center: http://msdn.microsoft.com/longhorn

The Longhorn Developer Center: WinFS; "WinFS" is the code name for the next generation storage platform in Windows "Longhorn." Taking advantage of database technologies, Microsoft is advancing the file system into an integrated store for file data, relational data, and XML data. Windows users will have intuitive new ways to find, relate, and act on their information, regardless of what application creates the data.

The Longhorn SDK: http://longhorn.msdn.microsoft.com

Introducing "Longhorn" for Developers: Chapter 4: Storage; Author Brent Rector gives you a close-up look at WinFS, the new storage system in Longhorn. Thanks to WinFS, developers can spend less time developing data access and storage code and more time working on unique application functionality.

Mike Deem's Blog: While he hasn't updated it for a while, you can check out Mike's blog here: http://anopinion.net/; Drop in, read what he has to say, and ask him to start posting again!

Jeremy Mazner's Blog: You can check out Jeremy's blog here: http://blogs.msdn.com/jmazner


Longhorn and WinFS