CURRENT_MEETING_REPORT_

Reported by Alan Emtage/Bunyip

Minutes of the Uniform Resource Identifiers Working Group (URI)

The notes for these minutes were taken by Craig Summerhill and edited by Alan Emtage.

Administrative Issues

The minutes from the Houston meeting were approved. The chair then proposed some changes to the original agenda that had been posted to the URI list. After some debate, the proposed changes were accepted.

Overview

Karen Sollins made a short presentation. An overview document describing the Internet Information Access Architecture needs to be written, and that work should be authored under the auspices of the IIIR Working Group. This working group will review the document.

The chair proposed a brief enumeration of possible services and actions of UR*s. Some felt that general enumeration is already taking place in the IIIR architecture document, and did not think that should happen here in this meeting; however, others thought it would be useful to go over the definitions that we developed at the Houston meeting. They were at the time fairly brief, but generally agreed upon.

Alan enumerated three potential services. (--> is a process, mapping or application.)

  1. General lookup                      URN --> URLs URCs
  2. Reverse lookup                      URL --> URNs URCs
  3. Characteristics/attribute matching  URC --> URNs URLs

It was asked whether a word processing document and the PostScript rendering of that document are considered to be the same thing, and whether that question is part of this overview. The response was that this had been discussed in several previous discussions and would be addressed in the URN requirements later.

Functional Requirements for URNs

Karen Sollins and Larry Masinter presented the current Internet-Draft on Functional Requirements for URNs.
Functional requirements of URNs in the document are:

  o Global scope (applicable globally)
  o Global uniqueness (unique over the network)
  o Persistence (should have lasting value - not transient)
  o Scalability (should be able to scale)
  o Grandfathering (must grandfather existing systems)
  o Extensibility (future extensions must be possible)
  o Independence (sole responsibility of naming authority)
  o Resolution (will not impede resolution to a URL)

Requirements for encoding:

  o Single encoding (visible string)
  o Human transcribability (humans should be able to read/transcribe)
  o Simple comparison (two URNs should be comparable)
  o Transport friendliness (can be transported within standard Internet protocols such as SMTP, FTP, etc.)
  o Machine consumption (machines can parse the data)

The question was raised of why the issue of URN ``sameness'' had been removed from the current draft when it was in previous drafts. Sollins and Masinter responded that they had looked at the issue and it was seen to cover separate issues which were either handled in other points or are not going to be covered:

  o When are two names `the same name' but with spelling variations? This is covered with `simple comparison.'
  o When are two resources with different names the same? (When is `the morning star' the same as `the evening star?')
  o Can you treat things with the same name as the same? (When is `the morning star' the same as `the morning star?')
  o The naming authority gets to decide whether two items are the same or not. Within their domain, they can assign the same URN to two different data formats with the same intellectual content, if they choose to do so.

Other requirements:

  o Name assignment is delegated to naming authorities (which are registered).
  o Naming authorities are encouraged to provide scalable naming, but it is not required.
  o Naming authorities should guarantee mapping to URLs, but do not need to provide this service themselves.
  o URNs must be built with a limited character set in order to be transportable.
  o Naming authority must abide by these constraints.

There was an issue of caching that has been dropped from the document, but there are going to be times in electronic publishing when you need to know that a URN applies to data format ``a'' but not data format ``b.'' The concept of Location Independent File Names (LIFNs) needs to be addressed. This was raised on the URI list previously. The issue of ``immutability'' needs to be introduced either to URNs or some other construct.

Sollins and Masinter replied that caching did get dropped because it was felt that it was not resolvable at this time. However, they believe that they have identified and gained consensus on a common set of requirements. There may be additional requirements, but whether they would appear in this document or a revision has currently not been decided. For example, resource security and authentication needs may warrant additional requirements. At this point, the document is an Informational document, not a standards track document. Also, the ability for the URN to guarantee distinguishing between mutable and immutable documents may be a requirement, but it is not clear that it should be added to this document.

Transcribability and limiting of character sets may discourage people from using them. Could a document in Japanese or Chinese have the URN embedded within the document, for example? Sollins and Masinter replied that the requirement was explicitly for transcribability and that it is not a requirement that it be easy to generate a URN from the title of a document or some other character string.

Simon Spero proposed an amendment to the section on simple comparison. Would it not be better to say that ``it is the goal to make the algorithm for comparing URNs as simple as possible''? Sollins/Masinter noted that this paragraph may have been munged during the editing process.
Spero was tasked to provide alternate wording. This process has to be simple; it has to be local. It should not get munged by mailers, it should be case insensitive, and it should ignore white space. ``Optional punctuation'' is a scary concept. A URN may have to be separated from the environment it lives in.

It was suggested that the difference between encoding and presentation be addressed. Sollins and Masinter said without a specific proposal to judge this, they did not want to limit it. The algorithm should be simple, but it should also be immune to the kinds of transcription errors that people make. Again, without a specific proposed standard it is difficult to make judgement on this issue.

The issue of LIFNs was again raised and it was remarked that this document does not come close to that. There is a need for a name such that a series of octets is immutable. These location independent file names need to be distinguishable from URNs too. Sollins and Masinter suggested that the group start with this document, and if there is a proposal for this set of constraints plus other added functionality, bring it back to the group to be considered. They believed that the group wanted something that would describe the core set of requirements for a URN which may or may not be a LIFN. The chair noted that this document does not prevent the creation of immutable URNs either, it simply does not require it.

The question of a single ``canonical encoding'' of a URN was raised by Mitra. Larry Masinter noted that if you have a simple comparison function, you also have the ability for a canonical representation and so it was left out of this document. Mitra responded that the definition of sameness would need to be addressed more specifically and John Curran noted that having a canonical representation is going to be important when we begin to develop implementations of URNs. The issue of canonical representation was unresolved.
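The comparison properties raised in this discussion (local, case insensitive, white space ignored) can be illustrated with a minimal sketch. It is not taken from the Internet-Draft: the function names and the sample URN strings are hypothetical, and folding the whole string to lower case is an assumption, since the group had not settled which parts of a URN are case insensitive. The sketch also shows Masinter's point that a simple comparison function yields a canonical representation as a by-product.

```python
def canonical_urn(urn: str) -> str:
    """Return a hypothetical canonical form: white space removed, lower case."""
    # str.split() with no arguments splits on any run of white space,
    # so joining the pieces drops spaces, tabs and line breaks that a
    # mailer or a human transcriber may have introduced.
    return "".join(urn.split()).lower()


def same_urn(a: str, b: str) -> bool:
    """Compare two URNs locally, without contacting any resolution service."""
    return canonical_urn(a) == canonical_urn(b)
```

Under these assumptions, `same_urn("IANA:Collins.UK.12345", " iana:collins.uk.\n12345 ")` holds, while two strings that differ in any other character compare as different.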
There was a proposal that ``in free text representations, URNs should be recognizable as such.'' The group agreed.

Keith Moore requested that the definition of URC in the Internet-Draft be made clearer.

Erik Huizer, Applications Area Director with primary responsibility for this group, suggested a short time limit in revising this document and having it go through to the RFC Editor. John Curran added that it should be put out on the net as a document which represents the ``rough'' consensus of the group, and that we are looking for wordsmithing but the concepts stand. Keith Moore disagreed. However, the group noted that since this was an Informational RFC it can be re-issued easily as other requirements become clearer. Erik Huizer suggested that in any case the document be sent to the IESG for a Last Call. The group agreed.

Chris Weider and Peter Deutsch had previously proposed a functional specification document for URNs, and it is now an Internet-Draft. He was asked by the chair to present possible elements of a URN to the group in light of the requirements document draft. In his proposed scheme a URN has four elements:

  1. wrapper or a tag
  2. URN, easily distinguished in text
  3. (hierarchical) naming authority identification
  4. opaque string

John Curran noted that the Masinter/Sollins document specifies `centrally registered' authorities and that you cannot have both centrally registered and hierarchical. Mitra suggested that the Domain Name System (DNS) is in fact both, depending on which part of the authority is examined.

Tim Berners-Lee noted that the URN proposal contained two components, an authority and an opaque string allocated by the authority. He would prefer something that is hierarchical like DNS. He suggested that we make the boundary between the naming authority and the opaque string flexible so that the boundary becomes invisible.

Chris Weider said that the current specification has three parts:

  1. Scheme ID (e.g., IANA)
  2. Authority
  3. Opaque string

with a visible boundary between the elements.

Mitra had some concerns and presented an example where the naming authority and a hierarchy would not have a boundary with the opaque string.

  IANA:Collins.UK.12345.n

UK here can be viewed as part of the naming authority or part of the opaque string. Use of DNS to resolve the URN (that is, to locate a resolution service) would be part of the scheme.

  o In this scheme, on day one, IANA does not become the sole naming authority. Initially Collins is the authority, then IANA:Collins.UK becomes the naming authority as things start to move around. When the resolution changes, you cannot remove the ``Collins,'' but you can move the place where resolution occurs.
  o Simon Spero said that this is already very similar to ASCII representation of distinguished names (RFC 1485).

Some members disagreed with Mitra's proposal:

  o Michael Mealling was concerned that DNS might die if TCP/IP goes away and ATM takes over. He wants URNs to work after DNS goes away, and does not want references to DNS in the specification.
  o Clifford Lynch said that the idea of a sliding divider in URNs is a bad idea which comes from one representation of how URNs are going to work. In this example, if you start with Collins as the naming authority and later decide to subdivide Collins into multiple smaller naming authorities you are going to have problems.

Mitra responded with the example:

  IANA:COLLINS_UK:

John Curran disagreed, saying that somebody's input form is going to break because the application is processing for a canonical name (even if it is not in the specification).

Further discussion was directed to the mailing list.

Charter

The chair proposed that the current working group charter be reviewed in the light of the current work. The group agreed. The chair has been tasked to propose a revised charter.

The group decided to start the discussion of URLs rather than URCs.
Functional Requirements for Internet Resource Locators

John Kunze was asked to present his draft functional requirements document on URLs. These requirements were generally agreed to in Houston. Before presenting the document he made the following points:

  o The document was sent to the list, and no comments were received.
  o He has purposely avoided the use of any reference to `uniform resource locator' in this document since it was supposed to ``exist'' before the functional specifications document.
  o There are a few definitions such as uniqueness, which the author felt were necessary.

The core requirements for URLs:

  o Transient (they will change)
  o Global scope
  o Parsable (machine consumable)
  o Distinguishable
  o Transport friendly
  o Transcribable
  o Will include service parameters
  o Extensible
  o No information other than that required to access object

Craig Summerhill proposed that wording be brought into line with the other functional requirements documents. General consensus was yes, where possible.

The general consensus was that the document should go forward, after any synchronization of wording with the Masinter/Sollins documents that might be needed. Kunze will send it to the list in the same time frame as the Masinter/Sollins document.

URL Functional Specifications Draft

The group discussed the current draft of the URL functional specifications document by Tim Berners-Lee. As a starting point in the discussion, the chair noted these areas of contention:

  o Gopher+ protocol `?' type
  o CWD slashes in FTP (directory problem)
  o file types in FTP

Mark McCahill presented his suggested syntax for the Gopher+ URL. The issue is that we need to be able to refer to Gopher+ within the URL in order to parse queries that are being done with `?' in the Web/HTTP implementation. He said that we could code the question mark as %4F or %09 (whatever the appropriate encoding is) in the protocol, but would like it to look prettier for obvious reasons.
  o Tim Berners-Lee said that the real question is whether this is a function of the Gopher+ protocol or part of the URL syntax. If it is part of the URL syntax, then it should be a question mark.
  o John Kunze had a problem with the question mark having global meaning, because we have a requirement that the string to the right of the service identifier is opaque. He also has a concern about using the `?' to get typing information, and thinks that is something better left for the URC.
  o John Curran said that anything that falls to the right of the service identifier is opaque, but did not see anything wrong with having a mapping that would make the opaque string easier to work with.
  o Tim Berners-Lee said when you are using a `?' then it is not opaque. In the WWW, he has generally gone along with this because that is what people wanted in this group.

General consensus was to go with Mark McCahill's proposal.

Keith Moore raised the concern of having Gopher URLs being able to spoof other protocols and wanted wording in the document to alert implementors of this problem.

  o Mark McCahill noted this is not a problem specific to the Gopher URLs since other URLs can do this.
  o John Curran did not want to see anything that addresses a security concern in the specification for URLs themselves. He suggested that the group enumerate these security concerns in an additional document, but did not want to see constraints placed on the specification in the RFC.
  o Reacting to Moore's comment about disallowing URLs to point at a port other than that assigned for the particular access method, Mitra noted that there are perfectly legitimate reasons to point a Gopher server at other ports.
  o Keith Moore maintained that there should be a specific passage within the Gopher section of the document pertaining to security. Tim Berners-Lee noted that there is in fact such a passage at the back of the document, but Moore wanted it in the section where the syntax for the Gopher URL is presented.
Moore was tasked by the group to produce appropriate wording for the document within two weeks of the meeting.

The chair presented the issue that in the FTP URL, the group long ago decided that although the forward slashes may look like a UNIX path, they are really characters that delimit the components in the directories statement and the terminal object is the file (object) itself. So, there is an issue: What does

  ftp://host/a/b/c

mean?

  (a) CWD a; CWD b; RETR c

or

  (b) CWD a/b; RETR c

Larry Masinter was chief proponent of scheme (a), Keith Moore for scheme (b). The following points were made:

  o In an Andrew File System, you cannot issue CWD a and CWD b, you must issue the CWD a/b command.
  o The slash is a delimiter, it is not part of the path.
  o What if some of the directories are nested with security, and the user does not have permission to read it (like directory ``b'' in this example)?
  o Does anybody have a case where they need to do multiple CWD commands?
  o If we use scheme (a), we have the option of issuing multiple CWD commands. The other option gives us only the choices of 0 or 1 CWD command.
  o Does anybody know of an existing practice? What are existing applications doing with these things now?

There was rough consensus that the specification should mandate multiple CWDs. The client should do multiple CWDs, and if a single CWD is to be issued for a path that contains slashes, those slashes need to be encoded or quoted. For example, a slash inside a component would not appear ``in the clear'' but would be hex encoded, with a clear slash used only for delimiting the directory structure from the retrieved object. Wordsmithing will be done by Larry Masinter and sent to the list for approval.

Huizer was asked by a number of group members if he would be part of the review process. He declined by saying that as part of the IESG he will be asked to review it there.

Tim Berners-Lee was asked to post the current draft and the revised draft as Internet-Drafts. Berners-Lee has agreed to this.
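The rough consensus on the FTP path (one CWD per clear slash, with a hex-encoded slash kept inside a single component) can be sketched as follows. This is an illustration under stated assumptions, not wording from the draft: the function name `ftp_commands` is hypothetical, and it assumes the hex encoding of a slash is written `%2F` and that the last component always names the file to retrieve.

```python
from urllib.parse import unquote, urlsplit


def ftp_commands(url: str) -> list[str]:
    """Map an FTP URL onto CWD/RETR commands, one CWD per path component."""
    path = urlsplit(url).path
    # Split on the literal slashes first: in the clear they are delimiters.
    segments = path.split("/")[1:]
    if not segments or segments[-1] == "":
        raise ValueError("URL does not name a file to retrieve")
    # Decode each component afterwards, so an encoded slash (%2F) ends up
    # inside one component and therefore inside one CWD argument.
    decoded = [unquote(segment) for segment in segments]
    return [f"CWD {name}" for name in decoded[:-1]] + [f"RETR {decoded[-1]}"]
```

With this sketch, `ftp://host/a/b/c` gives scheme (a), `CWD a`, `CWD b`, `RETR c`, while `ftp://host/a%2Fb/c` gives the single `CWD a/b` of scheme (b) followed by `RETR c`.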
Harri Salminen asked about the proposed URLs for news. He proposed non-lower case names in Usenet newsgroups. Some of these newsgroup servers can have spaces and wildcards and all kinds of characters in these names.

  o Larry Masinter said that the URL is supposed to refer to the object, and be unambiguous. So, we can leave out the wildcard. If you use a wildcard, you are not referring to a single object.
  o John Curran said that the fact that the character set is a superset, and people can put these characters in a URL does not mean people will use them. We should just leave this alone.

Tim Berners-Lee questioned whether the specification for NNTP should be in the document, since the functional requirements state that it needs to be globally unique. The group decided:

  o The URL is globally unique, but it does not have to be globally accessible.
  o The URL does not guarantee that it will get you there. We cannot guarantee that it will be globally accessible.

Concerning transfer type: we need to be able to encode the types of access required in order for transfer to occur. There are currently four types: IMAGE, ASCII and LOCAL in RFC 959 and directory (not in RFC 959). The issue here is one of syntax. Larry Masinter proposed that the directory ``type'' (since it is not part of the RFC) be specified as a trailing slash. This had the consensus of the group. The proposal is to deal with the others as ``!Type=A'' or ``;Type=I.'' For example:

  ftp://host/path/document!type=i

The following points were made in the discussion:

  o There is general consensus that this approach is okay.
  o Should the type information be mandatory in an FTP URL? Should there be a default?
  o Consensus is that the default transfer type should be ``unspecified.''
  o Will the delimiter be the bang (!) or the semi-colon (;)? John Kunze noted that the bang is a problem with the Unix C shell.
  o Consensus is that delimiting character will be semi-colon.
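The decisions above (semi-colon delimiter, trailing slash for a directory, default of ``unspecified'') can be put together in a short sketch. The function name `split_ftp_type` and the labels `DIRECTORY` and `UNSPECIFIED` are hypothetical; the minutes fix only the delimiter, the trailing-slash convention and the default, not any particular representation.

```python
def split_ftp_type(path: str) -> tuple[str, str]:
    """Split an FTP URL path into (path, transfer type)."""
    base, sep, param = path.partition(";")
    # An explicit ";type=X" parameter wins; the type letter is case insensitive.
    if sep and param.lower().startswith("type="):
        return base, param[len("type="):].upper()
    # A trailing slash marks a directory, which is not an RFC 959 type.
    if path.endswith("/"):
        return path, "DIRECTORY"
    # No type information present: the agreed default is "unspecified".
    return path, "UNSPECIFIED"
```

For example, `split_ftp_type("/path/document;type=i")` separates the path `/path/document` from the type `I`, leaving it to the client to map `I` and `A` onto the IMAGE and ASCII types of RFC 959.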
Tim Berners-Lee, Larry Masinter, Mark McCahill and Ned Freed have been tasked to incorporate the required changes and reply to the list within two weeks.

Uniform Resource Characteristics

Michael Mealling has a draft that begins to talk about the functional requirements of URCs. Erik Huizer asked that it be posted as an Internet-Draft, and Michael Mealling has agreed.

Jim Fullton was asked to review the functional specifications for URCs.

  o Should they be ``characteristic'' or ``citation''?
  o Larry Masinter remarked that the group had this problem with URLs and URNs because we did not understand enough about what they were, and we ended up changing the title of each of those. He suggested that we are putting the cart ahead of the horse.

In overview, the draft functional requirements of the URCs are:

  o Encapsulation
  o Structure
  o Scalability
  o Grandfathering
  o Caching
  o Resolution
  o Human readable
  o Transport friendly
  o Machine consumable

A general discussion was held in the remaining time.

  o Mitra noted that a URC needs to be able to differentiate the URL and URN information from the other meta-information within it, and you need to know what elements of the meta-information pertain to which URL within it. Mealling noted that this was covered under the ``Structure'' in the document.

Several people expressed concern about URCs as currently debated.

  o Keith Moore did not think you can lump all these things into a single structure and make it have any meaning. It is non-optimal at best, and will not work at worst.
  o Larry Masinter had a concern about lumping these things together, especially as far as attachments are concerned. He did not think that the encapsulation goes nearly far enough towards deciphering collections of objects.
  o Jim Conklin had a problem with referencing URLs within a URC: the URL is going to be too much in flux.
  o Clifford Lynch was very troubled by the open-ended nature of these URCs in the absence of any real concrete usages for them.
    He can see a need to encapsulate information in order to be able to pass it around, but is worried that this enumeration of properties is too abstract to be useful. He is specifically worried about little micro-bibliographies being shipped around in these things. He does not think we share any common usage scenarios among the many of us in this group.

There was discussion about whether we can build the framework for how to handle the data until we understand what the things are that we want to put into a URC skeleton.

  o The chair noted that there was not much interest on the list in talking about scenarios when it was brought up, but it will be raised again.
  o Larry Masinter suggested composing a very general specification, and proceeding to some scenarios for how these things can be employed. For example, he would like to deal with file data formats and have a syntax for talking about that.
  o Mitra agreed to repost his previously posted scenarios to the list as an Internet-Draft.

New Business

  o Keith Moore wants to look at defining LIFNs. There was consensus that it was important for this group.
  o Karen Sollins noted that the Internet has to start caching or getting information much closer to the clients. We are starting to flood the network with queries. She does not think that we are putting the information in the right locations. The chair suggested that this should be discussed at the IIIR meeting.
  o There is a considerable amount of overlap between what we are doing and other groups. Should we schedule joint meetings? Erik Huizer said that we should request that other groups formally monitor our mailing lists.
  o John Curran suggested that now that we have URLs and URNs on track, he thinks we need a container for ``content type'' specification. Also, he thinks we need to begin exploring issues related to mapping services, etc., but is not sure that this is the right group for doing this.
  o John Kunze said that he thinks we are trying to use a committee structure for solving a lot of interoperability problems, and we need a better structure for providing support to the developers.

Closing Remarks

The following remarks were given by Erik Huizer.

The work that this group is doing is getting more and more important (or rather that more and more people are beginning to realize how important this work is going to be). He has been pushing to get the URL and URN specifications out, and realizes some people are feeling put off by this, but he believes that the group needs to get more communication out to people that are building applications with these things already.

The group lacks a good overview of the architecture. Almost every person in this room has a different view of what that architecture should be. So, a couple of things that are being discussed:

  o Start a discussion in the IESG about creating an IETF area for Internet Integrated Information activities.
  o The IAB holds workshops a couple of times a year for invited people. He has applied for an IIIA workshop on this topic. We cannot invite all members of this working group, but he wants to make sure that all members of this community are represented.

Attendees

Kevin Altis             altis@ibeam.intel.com
Farhad Anklesaria       fxa@boombox.micro.umn.edu
John Beck               jbeck@cup.hp.com
Alexis Bor              bora@ct.si.cs.boeing.com
Sepideh Boroumand       sepideh@jacks.gsfc.nasa.gov
Luc Boulianne           lucb@bunyip.com
Mic Bowman              mic@transarc.com
Gregg Brekke            gbrekke@mr.net
Lloyd Brodsky           lbrodsky@rocksolid.com
Brad Burdick            bburdick@radio.com
Randy Bush              randy@psg.com
C. Allan Cargille       allan.cargille@cs.wisc.edu
Michael Carroll         br.mjc@rlg.stanford.edu
Jodi-Ann Chu            jodi@uhunix.uhcc.hawaii.edu
Jim Conklin             jbc@bitnic.educom.edu
Naomi Courter           naomi@concert.net
David Crocker           dcrocker@mordor.stanford.edu
Glen Daniels            gub@elf.com
Bruce Davie             bsd@bellcore.com
Dante Delucia           dante@usc.edu
Peter DiCamillo         Peter_DiCamillo@brown.edu
Alan Emtage             bajan@bunyip.com
Sheryl Erez             erez@cac.washington.edu
Richard Everman         reverman@ka.reg.uci.edu
Patrik Faltstrom        paf@nada.kth.se
Jill Foster             Jill.Foster@newcastle.ac.uk
Paul Francis            francis@cactus.slab.ntt.jp
Ned Freed               ned@innosoft.com
Thane Frivold           frivold@erg.sri.com
Jim Fullton             fullton@cnidr.org
Kevin Gamiel            kgamiel@cnidr.org
Arlene Getchell         getchell@es.net
Anders Gillner          awg@sunet.se
Judith Grass            grass@cnri.reston.va.us
Sally Hambridge         sallyh@ludwig.intel.com
Deborah Hamilton        debbieh@internic.net
Darren Hardy            hardy@cs.colorado.edu
Art Harkin              ash@cup.hp.com
Alisa Hata              hata@cac.washington.edu
Roland Hedberg          Roland.Hedberg@umdac.umu.se
Alex Hopmann            alex.hopmann@resnova.com
Tim Howes               tim@umich.edu
Richard Huber           rvh@ds.internic.net
Erik Huizer             Erik.Huizer@SURFnet.nl
Priscilla Jane Huston   phuston@nsf.gov
Barbara Jennings        bjjenni@sandia.gov
Marko Kaittola          Marko.Kaittola@dante.org.uk
Jim Knowles             jknowles@binky.arc.nasa.gov
Andrew Knutsen          andrewk@sco.com
Padma Krishnaswamy      kri@cc.bellcore.com
John Kunze              jak@violet.berkeley.edu
Sylvain Langlois        Sylvain.Langlois@der.edf.fr
Walter Lazear           lazear@gateway.mitre.org
Edward Levinson         levinson@pica.army.mil
Ben Levy                seven@ftp.com
Clifford Lynch          calur@uccmvsa.ucop.edu
Janet L. Marcisak       jlm@ftp.com
April Marine            april@atlas.arc.nasa.gov
Marilyn Martin          martin@netcom.ubc.ca
Larry Masinter          masinter@parc.xerox.com
Chip Matthes            chip@delphi.com
Mark McCahill           mpm@boombox.micro.umn.edu
James McKinney          jmck@americast.com
Michael McLay           mclay@eeel.nist.gov
Mitra                   mitra@pandora.sf.ca.us
Keith Moore             moore@cs.utk.edu
Mark Needleman          mhn@stubbs.ucop.edu
Clifford Neuman         bcn@isi.edu
Paul-Andre Pays         pays@faugeres.inria.fr
Pete Percival           percival@indiana.edu
Karen Petraska-Veum     karen.veum@gsfc.nasa.gov
George Phillips         phillips@cs.ubc.ca
Thomas Powell           sestrada@aldea.com
Joyce K. Reynolds       jkrey@isi.edu
Steven Russert          srussert@atc.boeing.com
Srinivas Sataluri       sri@internic.net
Rickard Schoultz        schoultz@sunet.se
Mark Smith              mcs@umich.edu
Suzanne Smith           smith@es.net
Karen Sollins           sollins@lcs.mit.edu
Milan Sova              sova@feld.cvut.cz
Simon Spero             ses@tipper.oit.unc.edu
John Stewart            jstewart@cnri.reston.va.us
Walter Stickle          wls@ftp.com
Craig Summerhill        craig@cni.org
Peter Sylvester         peter.sylvester@inria.fr
Dave Thompson           davet@ncsa.uiuc.edu
Aleks Totic             atotic@ncsa.uiuc.edu
Phil Trubey             ptrubey@netcom.com
Ruediger Volk           rv@informatik.uni-dortmund.de
Glenn Waters            gwaters@bnr.ca
Chris Weider            clw@bunyip.com
Phil Wintering          pvw@americast.com