[Bioperl-l] svn auto-properties [was Re: First cut svn repository]

David Messina dmessina at wustl.edu
Sun Jul 1 01:38:48 EDT 2007

> [Nath]
> I think the list of seq formats recognised by Bioperl in Bio::SeqIO and
> Bio::AlignIO would be a good start, as these are likely to be the ones
> that are sensitive to file format recognition and thus could break
> tests if renamed.

Sounds good to me. I will do a quick tour of the rest of the repo
looking for other common or important file extensions, but I don't
expect there to be many, if any.

> [still Nath]
> I think a lot of people have used "." in file names as an alternative
> to a space. I think it would be beneficial to use an underscore "_" in
> these cases and leave the "." to represent the beginning of the file
> extension.

That's a great idea.
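
To make the renaming rule concrete, here is a minimal Python sketch. The function name normalize_name and the example file names are made up for illustration, and it assumes the extension is whatever follows the last "." in the name, so a two-part extension like ".tar.gz" would lose its first dot.

```python
import os.path


def normalize_name(filename: str) -> str:
    """Replace every "." except the one starting the extension with "_"."""
    # splitext splits at the last dot: "a.b.fasta" -> ("a.b", ".fasta").
    stem, ext = os.path.splitext(filename)
    # Keep any leading dots so hidden files such as ".cvsignore" are untouched.
    lead = len(stem) - len(stem.lstrip("."))
    return stem[:lead] + stem[lead:].replace(".", "_") + ext


print(normalize_name("test.data.v2.fasta"))
```

A name that has only one dot, or none at all, comes back unchanged.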

> [Chris]
> Do we need to define every filetype extension, or can there be a
> fallback (e.g., if it isn't on the list or has no extension, it's
> plain text)?

For every file that's added, svn takes a peek to see if it's
human-readable. If not, it's tagged with the generic MIME type
application/octet-stream. (It does this so it knows not to try to do
diffs and merges on a binary file.)

So the default for a human-readable file is no MIME type, which I  
believe is essentially the same thing as text/plain.

Then, regardless of the outcome of svn's peek, any matching auto-props
are applied, overriding svn's choice.
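
The two-step order described above can be modelled roughly in Python. This is only a sketch under stated assumptions: the rules in AUTO_PROPS are hypothetical, the NUL-byte check in looks_binary is a crude stand-in for svn's real peek rather than its actual test, and the function names are my own.

```python
from fnmatch import fnmatchcase

# Hypothetical auto-props rules: (file name pattern, {property: value}).
AUTO_PROPS = [
    ("*.fasta", {"svn:mime-type": "text/plain"}),
    ("*.png", {"svn:mime-type": "image/png"}),
]


def looks_binary(data: bytes) -> bool:
    """Stand-in for svn's peek (an assumption): a NUL byte means binary."""
    return b"\x00" in data[:1024]


def props_on_add(filename: str, data: bytes) -> dict:
    """Model which properties a newly added file ends up with."""
    props = {}
    # Step 1: the peek. Binary files get the generic MIME type;
    # human-readable files get no MIME type at all.
    if looks_binary(data):
        props["svn:mime-type"] = "application/octet-stream"
    # Step 2: matching auto-props are applied afterwards and
    # override whatever step 1 chose.
    for pattern, rule_props in AUTO_PROPS:
        if fnmatchcase(filename, pattern):
            props.update(rule_props)
    return props


print(props_on_add("logo.png", b"\x89PNG\x00"))
```

In this model a text file with no matching rule simply ends up with no properties, which is the fallback case Chris asked about.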

So if we don't define every extension, I think we'll be fine. It'd be
nice to have everything tagged with a MIME type, though. For one
thing, Apache will use it to serve each file with the right content
type when people browse the repo over the web. For another, metadata
is cool. :)

One more thing: in the course of reading up on this, I learned that
my earlier expectation about multiple auto-prop matches was
incorrect. It's true that multiple unrelated matches mean that
multiple properties are set on the file. But when a file matches
multiple *conflicting* auto-prop patterns, there's no telling
which value it'll get.
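
Since conflicting matches are unpredictable, it may be worth checking a proposed rule list for them before we commit to it. Here is a minimal Python sketch of such a check. The rules and the find_conflicts helper are hypothetical, and Python's fnmatch globbing is only assumed to be close enough to svn's pattern matching for this purpose.

```python
from fnmatch import fnmatchcase

# Hypothetical rules; all three patterns match "test.aln.txt".
AUTO_PROPS = [
    ("*.txt", {"svn:mime-type": "text/plain", "svn:eol-style": "native"}),
    ("test.*", {"svn:mime-type": "application/octet-stream"}),
    ("*.aln.txt", {"svn:keywords": "Id"}),
]


def find_conflicts(filename, rules=AUTO_PROPS):
    """Return {property: sorted competing values} for this file name."""
    seen = {}
    for pattern, props in rules:
        if fnmatchcase(filename, pattern):
            for name, value in props.items():
                seen.setdefault(name, set()).add(value)
    # Unrelated properties just accumulate; only a property that was
    # given more than one distinct value counts as a conflict.
    return {n: sorted(v) for n, v in seen.items() if len(v) > 1}


print(find_conflicts("test.aln.txt"))
```

Here svn:eol-style and svn:keywords each come from a single rule, so only svn:mime-type is reported as a conflict.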

