Is it possible to achieve a link based FS? #3

Open
opened 2023-09-06 16:04:18 -07:00 by pexch · 3 comments
pexch commented 2023-09-06 16:04:18 -07:00 (Migrated from github.com)

I'm currently using `webdav-server`, but I'm facing problems when passing a direct download link connected to a file. If the file is too big, the stream is loaded into memory and it breaks. I assume the issue lies in the fact that it's trying to load a huge amount of data into memory. My biggest problem is media files such as MP4, etc.

Just wondering if there's a way to load only the parts the user requested, if it's a media file for example?

And is this doable in nephele?

hperrin commented 2023-11-08 14:28:12 -08:00 (Migrated from github.com)

In Nephele, all file operations are done with streams that respect backpressure, so if, for example, the network is slow and the client can only receive data at a certain rate, the buffer will stop filling from the filesystem once it reaches the `highWaterMark` of the stream. This should prevent Nephele from ever running out of memory, unless system memory is already extremely limited (`highWaterMark` is only 16 KiB by default).
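The `highWaterMark` behavior can be seen in a small sketch (the sink and the 4-byte limit are illustrative, not Nephele's code):

```typescript
import { Writable } from 'node:stream';

// Illustrative sketch: a slow sink with a tiny 4-byte buffer. Once more
// data is queued than highWaterMark allows, write() returns false,
// signaling the producer to stop until 'drain' fires.
const sink = new Writable({
  highWaterMark: 4,
  write(chunk, encoding, callback) {
    setTimeout(callback, 10); // simulate a slow consumer (e.g. a slow network)
  },
});

const underLimit = sink.write('ab');   // 2 bytes buffered, under the mark
const overLimit = sink.write('cdefg'); // now over the mark: returns false
sink.once('drain', () => {
  // The buffer has emptied below highWaterMark; safe to write again.
});
```

A producer that checks that return value never buffers more than roughly `highWaterMark` bytes, no matter how large the file is.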

I'd be really surprised if that's not also how `webdav-server` works, because that's pretty standard behavior. That's definitely something they should be doing.

Also, Nephele does indeed support range requests, which let you load specific parts of a file. So, for example, you should be able to jump around in a video that's being streamed from a Nephele server to VLC or your browser.
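For illustration, a single `Range` header like `bytes=0-1023` resolves to a byte span roughly like this (a simplified sketch, not Nephele's actual parser; multi-range and malformed headers need more handling):

```typescript
// Simplified sketch of resolving a single-range header such as
// "bytes=0-1023", "bytes=500-", or "bytes=-500" against a file size.
function parseRange(
  header: string,
  size: number
): { start: number; end: number } | null {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header);
  if (!match || (!match[1] && !match[2])) return null;
  // "bytes=-500" means the last 500 bytes of the file.
  const start = match[1] ? Number(match[1]) : size - Number(match[2]);
  const end = match[1] && match[2] ? Number(match[2]) : size - 1;
  if (start < 0 || start > end || end >= size) return null;
  return { start, end };
}
```

The server then reads and sends only the `start`..`end` span instead of the whole file.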

One thing to note: if you want Nephele to be able to handle big transfers, you need to set the [`requestTimeout`](https://nodejs.org/api/https.html#serverrequesttimeout) on the server to something higher than the default, which is only 5 minutes.
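Setting that on a plain Node server looks something like this (the one-hour value is just an example):

```typescript
import { createServer } from 'node:http';

// Sketch: raise Node's request timeout so large transfers aren't cut off.
// The default is 300000 ms (5 minutes); here we allow up to one hour.
const server = createServer((req, res) => {
  res.end('ok');
});
server.requestTimeout = 60 * 60 * 1000;
```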

hperrin commented 2023-11-08 14:28:53 -08:00 (Migrated from github.com)

@pexch Apologies for the delayed response. Hopefully that helps.

hperrin commented 2023-11-08 15:14:12 -08:00 (Migrated from github.com)

I looked at the `webdav-server` code, and it does look like they are not respecting backpressure from the network on multipart range requests, which is most likely why you're seeing these crashes. If you want to let them know so they can fix it, this is where the problem is:

https://github.com/OpenMarshal/npm-WebDAV-Server/blob/5f237622e81886f0305293810725c07b54a2598f/src/server/v2/commands/Get.ts#L66

That `write` function [returns false](https://nodejs.org/api/stream.html#writablewritechunk-encoding-callback) when there's backpressure, so it should look like this:

https://github.com/sciactive/nephele/blob/a9881ee9c096e81f09e17584b744069654d5406c/packages/nephele/src/Methods/GET_HEAD.ts#L233-L237

They need to pause their source stream (reading from the disk) and only resume it once their target stream (sending across the network) drains. They're also going to need to rewrite this code:

https://github.com/OpenMarshal/npm-WebDAV-Server/blob/5f237622e81886f0305293810725c07b54a2598f/src/server/v2/commands/Get.ts#L226-L265

That code reads each range into memory before building and sending the response all at once.
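The pause/resume pattern described above can be sketched like this (the `relay` helper is illustrative, not the actual Nephele code; in practice Node's `stream.pipeline` does the same thing for you):

```typescript
import { Readable, Writable } from 'node:stream';

// Illustrative sketch: copy a source (disk) stream to a target (network)
// stream while respecting backpressure. When target.write() returns false,
// pause the source and resume it only after the target drains.
function relay(source: Readable, target: Writable): Promise<void> {
  return new Promise((resolve, reject) => {
    source.on('data', (chunk) => {
      if (!target.write(chunk)) {
        source.pause(); // target buffer is full: stop reading from disk
        target.once('drain', () => source.resume()); // buffer emptied
      }
    });
    source.on('end', () => target.end()); // flush remaining buffered data
    target.on('finish', resolve);
    source.on('error', reject);
    target.on('error', reject);
  });
}
```

This way, at most one `highWaterMark`'s worth of data sits in memory at a time, regardless of file size or client speed.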


Another thing I noticed is that they're reading the entire file when they get a range request, which can hurt performance and wastes disk IO. For every request they create a full read stream:

https://github.com/OpenMarshal/npm-WebDAV-Server/blob/5f237622e81886f0305293810725c07b54a2598f/src/resource/v1/physical/PhysicalFile.ts#L82

They can improve performance by creating a [ranged read stream](https://nodejs.org/api/fs.html#filehandlecreatereadstreamoptions) for each range, like this:

https://github.com/sciactive/nephele/blob/a9881ee9c096e81f09e17584b744069654d5406c/packages/adapter-file-system/src/Resource.ts#L143

Otherwise, if, say, you just wanted to watch the end credits of a 24 GB 4K movie file, you'd have to read through all 24 GB (roughly 30 seconds at SATA speeds) every time you tried to seek during the stream.
