Please give advise about my first patch attempt

Thu Aug 27 04:49:51 EDT 2020

Hi,

i am preparing my first patch.
May i ask for review and advise how to make it acceptable ?

The patch shall fix an old regression of cdrom and sr drivers.
Although drivers/cdrom has no mailing list entry in MAINTAINERS, i assume
that such patches should go to linux-scsi at vger.kernel.org which is listed
for drivers/scsi/sr*. Right ?

So the change affects two different subsystems which have the same maintainer
and have a very close relation. Do i need to split it into two, nevertheless ?

The change is relative to git tag v5.8 of
  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
It is not locally committed yet, so i cannot run git format-patch for now.
I plan to do if i get some approval here.

I tested with three different optical drives at SATA and USB by
  dd if=/dev/sr0 bs=2048 count=1 of=/dev/null
Unchanged yields ENOMEDIUM or EIO. Changed yields success.

Please also tell me if my mailer (used with this mail) would cause problems.

--------------------------------------------------------------------------

Affected files:
  drivers/cdrom/cdrom.c
  drivers/scsi/sr.c
  include/linux/cdrom.h

Check passed:
  scripts/checkpatch.pl
reports
  total: 0 errors, 0 warnings, 131 lines checked

--------------------------------------------------------------------------
Intended commit message:
--------------------------------------------------------------------------

Re-enable waiting for CD drive to complete medium decision after tray load

If
  open("/dev/sr0", O_RDONLY);
pulls in the tray of an optical drive, it immediately returns -1 with
errno ENOMEDIUM or the first read(2) fails with EIO. Later, when the drive
has stopped blinking, another open() yields success and read() works.
This affects not only userland reading of the device file but also mounting
the device.

Commit 210ba1d1724f5c4ed87a2ab1a21ca861a915f734 of january 2008 switched
sr_drive_status() away from indirectly using sr_do_ioctl() for sending
TEST UNIT READY commands. sr_do_ioctl() has a wait-and-retry loop for
the drive's intermediate sense reply:
  2 04 01 Logical unit is in process of becoming ready
Since then sr_drive_status() does not wait any more.
By commit 96bcc722c47d07b6fd05c9d0cb3ab8ea5574c5b1 of july 2008 it now
returns CDS_DRIVE_NOT_READY if above drive reply is received.

But the function open_for_data() in drivers/cdrom/cdrom.c expects that
its call of cdo->drive_status() waits until the drive has decided about
the usability of the inserted medium. Any reply other than CDS_DISC_OK
yields ENOMEDIUM.
Even if the drive waits with its reply to START/STOP UNIT until it is
ready (e.g. Pioneer BDR-S09), the first read(2) yields EIO, because the
newly loaded medium gets not properly assessed by check_disk_change()
before sr_block_open() returns.

Fix the problem by a new function wait_for_medium_decision() which gets
called by open_for_data() instead of directly calling cdo->drive_status().
Make sure that scsi/sr.c assesses the newly loaded medium without additional
risk of deadlock.

After loading the tray and due waiting for readiness, cdrom_open() now returns
-EDEADLK to urge its caller to drop its mutex, to re-run check_disk_change(),
and then to call cdrom_open() again. Only scsi/sr.c is for now aware of this.

cdrom/gdrom.c, paride/pcd.c, and ide/ide-cd.c call cdrom_open() too, but i
cannot test them. So their unchanged code will return -EDEADLK instead of
retrying. Not worse or more deceiving than -ENOMEDIUM or -EIO.

Signed-off-by: Thomas Schmitt <scdbackup at gmx.net>

--------------------------------------------------------------------------
Code change:
--------------------------------------------------------------------------

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index d82b3b7658bd..9e113b0dfc82 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -286,6 +286,18 @@
 #include <scsi/scsi_common.h>
 #include <scsi/scsi_request.h>

+/*
+ * For the wait-and-retry loop after possibly having loaded the drive tray.
+ * 10 retries in 20 seconds are hardcoded in sr_do_ioctl() which was used
+ * up to 2008.
+ * But time spans up to 25 seconds were measured by libburn on
+ * drives connected via SATA or USB-SATA bridges.
+ * So 20 retries * 2000 ms = 40 seconds seems more appropriate.
+ */
+#define CD_OPEN_MEDIUM_RETRY_MAX 20
+#define CD_OPEN_MEDIUM_RETRY_MSLEEP 2000
+#include <linux/delay.h>
+
 /* used to tell the module to turn on full debugging messages */
 static bool debug;
 /* default compatibility mode */
@@ -1040,10 +1052,38 @@ static void cdrom_count_tracks(struct cdrom_device_info *cdi, tracktype *tracks)
 	       tracks->cdi, tracks->xa);
 }

+static
+int wait_for_medium_decision(struct cdrom_device_info *cdi)
+{
+	int ret, retry = 0;
+	const struct cdrom_device_ops *cdo = cdi->ops;
+
+	/* Wait until the intermediate drive status CDS_DRIVE_NOT_READY ends */
+	while (1) {
+		ret = cdo->drive_status(cdi, CDSL_CURRENT);
+		if (ret == CDS_DRIVE_NOT_READY &&
+		    retry++ < CD_OPEN_MEDIUM_RETRY_MAX)
+			msleep(CD_OPEN_MEDIUM_RETRY_MSLEEP);
+		else
+			break;
+	}
+	if (ret != CDS_DISC_OK)
+		return ret;
+	/*
+	 * It is hard to test whether very recent readiness can cause race
+	 * conditions with media change events. So wait a while to never
+	 * undercut the average delay between actual readiness and detection
+	 * which was tested without this additional msleep().
+	 */
+	msleep(CD_OPEN_MEDIUM_RETRY_MSLEEP / 2);
+
+	return CDS_DISC_OK;
+}
+
 static
 int open_for_data(struct cdrom_device_info *cdi)
 {
-	int ret;
+	int ret, tray_was_moved = 0;
 	const struct cdrom_device_ops *cdo = cdi->ops;
 	tracktype tracks;
 	cd_dbg(CD_OPEN, "entering open_for_data\n");
@@ -1069,13 +1109,17 @@ int open_for_data(struct cdrom_device_info *cdi)
 					ret=-ENOMEDIUM;
 					goto clean_up_and_return;
 				}
+				tray_was_moved = 1;
 			} else {
 				cd_dbg(CD_OPEN, "bummer. this drive can't close the tray.\n");
 				ret=-ENOMEDIUM;
 				goto clean_up_and_return;
 			}
-			/* Ok, the door should be closed now.. Check again */
-			ret = cdo->drive_status(cdi, CDSL_CURRENT);
+			/*
+			 * The door should be closed now, or in the process of
+			 * closing and assessing the medium.
+			 */
+			ret = wait_for_medium_decision(cdi);
 			if ((ret == CDS_NO_DISC) || (ret==CDS_TRAY_OPEN)) {
 				cd_dbg(CD_OPEN, "bummer. the tray is still not closed.\n");
 				cd_dbg(CD_OPEN, "tray might not contain a medium\n");
@@ -1090,6 +1134,14 @@ int open_for_data(struct cdrom_device_info *cdi)
 			ret = -ENOMEDIUM;
 			goto clean_up_and_return;
 		}
+		if (tray_was_moved) {
+			/*
+			 * Bail out now to let callers like sr_block_open()
+			 * unlock their mutex and run check_disk_change()
+			 */
+			ret = -EDEADLK;
+			goto clean_up_and_return;
+		}
 	}
 	cdrom_count_tracks(cdi, &tracks);
 	if (tracks.error == CDS_NO_DISC) {
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 0c4aa4665a2f..a26c33c44ab9 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -521,7 +521,7 @@ static int sr_block_open(struct block_device *bdev, fmode_t mode)
 {
 	struct scsi_cd *cd;
 	struct scsi_device *sdev;
-	int ret = -ENXIO;
+	int ret = -ENXIO, tried = 0;

 	cd = scsi_cd_get(bdev->bd_disk);
 	if (!cd)
@@ -529,11 +529,16 @@ static int sr_block_open(struct block_device *bdev, fmode_t mode)

 	sdev = cd->device;
 	scsi_autopm_get_device(sdev);
+retry:
+	tried++;
 	check_disk_change(bdev);

 	mutex_lock(&cd->lock);
 	ret = cdrom_open(&cd->cdi, bdev, mode);
 	mutex_unlock(&cd->lock);
+	if (ret == -EDEADLK && tried <= 1)
+		/* Tray was loaded. Assess medium by check_disk_change(). */
+		goto retry;

 	scsi_autopm_put_device(sdev);
 	if (ret)
diff --git a/include/linux/cdrom.h b/include/linux/cdrom.h
index 8543fa59da72..68d8297e6208 100644
--- a/include/linux/cdrom.h
+++ b/include/linux/cdrom.h
@@ -100,8 +100,17 @@ int cdrom_read_tocentry(struct cdrom_device_info *cdi,
 		struct cdrom_tocentry *entry);

 /* the general block_device operations structure: */
+
+/*
+ * Peculiarity:
+ * cdrom_open() returns -EDEADLK after automatically loading the tray, because
+ * it expects to be called under various mutexes after the callers called
+ * check_disk_change() outside of these mutex acquirations. Callers which get
+ * -EDEADLK should for once re-run check_disk_change() and retry cdrom_open().
+ */
 extern int cdrom_open(struct cdrom_device_info *cdi, struct block_device *bdev,
 			fmode_t mode);
+
 extern void cdrom_release(struct cdrom_device_info *cdi, fmode_t mode);
 extern int cdrom_ioctl(struct cdrom_device_info *cdi, struct block_device *bdev,
 		       fmode_t mode, unsigned int cmd, unsigned long arg);

--------------------------------------------------------------------------


Have a nice day :)

Thomas